Abstract

These notes cover the first eight lectures of the class Many Models 
of Complexity taught by Laszlo Lovasz at Princeton University in the 
Fall of 1990. The first eight lectures were on evasiveness of graph prop- 
erties and related topics; subsequent lectures were on communication 
complexity and Kolmogorov complexity and are covered in other sets 
of notes. 

The fundamental question considered in these notes is, given a func- 
tion, how many bits of the input an algorithm must check in the worst 
case before it knows the value of the function. The algorithms
considered are deterministic, randomized, and non-deterministic. The
functions considered are primarily graph properties: predicates on edge
sets of graphs invariant under relabeling of the nodes.

Figure 1: A Simple Decision Tree

1 Decision Trees and Evasive Properties 

The goal of this course is to examine various ways of measuring the complex- 
ity of computations. In this lecture, we discuss the decision tree complexity 
of functions. We begin a characterization of which functions require that for 
any deterministic algorithm for computing the function, there is some input 
for which the algorithm checks all the bits of the input. 

1.1 Decision Trees 

A decision tree is a tree representing the logical structure of certain
algorithms on various inputs. The nodes of the tree represent branch points
of the computation (places where more than one outcome is possible based
on some predicate of the input) and the leaves represent possible outcomes.
Given a particular input, one starts at the root of the tree, performs
the test at that node, and descends accordingly into one of the subtrees of
the root. Continuing in this way, one reaches a leaf node which represents
the outcome of the computation.

Given a function $f : \{0,1\}^n \to \{0,1\}$, a simple decision tree for the
function is a binary tree whose internal nodes have labels from
$\{1, 2, \ldots, n\}$ and whose leaves have labels from $\{0, 1\}$. If a node
has label $i$, then the test performed at that node is to examine the $i$th
bit of the input. If the result is 0, one descends into the left subtree,
whereas if the result is 1, one descends into the right subtree. The label of
the leaf so reached is the value of the function on the input.
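
The definition of a simple decision tree can be made concrete with a small sketch. The class names `Leaf` and `Node`, the function `evaluate`, and the example tree `and_tree` are illustrative choices, not notation from the notes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: int          # value of the function, 0 or 1


@dataclass
class Node:
    var: int            # 1-based index i of the input bit examined here
    left: "Tree"        # subtree followed when the bit is 0
    right: "Tree"       # subtree followed when the bit is 1


Tree = Union[Leaf, Node]


def evaluate(tree, x):
    """Follow the path selected by input x; return (value, bits examined)."""
    examined = 0
    while isinstance(tree, Node):
        bit = x[tree.var - 1]
        tree = tree.right if bit else tree.left
        examined += 1
    return tree.value, examined


# A tree for f(x1, x2) = x1 AND x2: when x1 = 0 the answer is already known.
and_tree = Node(1, Leaf(0), Node(2, Leaf(0), Leaf(1)))
```

The depth of the tree (here 2) is the largest number of bits examined over all inputs, which is the quantity the rest of the lecture tries to minimize.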

While it is clear that any such function $f$ has a simple decision tree, we
will be interested in simple decision trees for $f$ which have minimal depth
$D(f)$. $D(f)$ is called the decision tree complexity of $f$.

It is clear that $D(f)$ is at most the number of variables of $f$. A simple
example which achieves this upper bound is the parity function
$f(x_1, \ldots, x_n) = x_1 + x_2 + \cdots + x_n \bmod 2$. For this function
every leaf of any simple decision tree for $f$ has depth $n$, because if the
value of some $x_i$ has not been examined by the time a leaf is reached for
an input $x$, the tree gives the same answer when $x_i$ is flipped, so the
function computed is not parity.
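
For small $n$ the quantity $D(f)$ can be computed by exhaustive search: a constant restriction needs no further queries, and otherwise the best tree examines the bit that minimizes the worse of the two resulting subproblems. The following is a minimal sketch of that recursion; the function names are assumptions made for illustration, and the running time is exponential in $n$.

```python
from functools import lru_cache
from itertools import product


def decision_tree_complexity(f, n):
    """Minimal depth of a simple decision tree for f on n bits (exhaustive)."""

    @lru_cache(maxsize=None)
    def depth(partial):
        # partial[i] is 0 or 1 once bit i has been examined, None otherwise.
        free = [i for i, b in enumerate(partial) if b is None]
        values = set()
        for bits in product((0, 1), repeat=len(free)):
            x = list(partial)
            for i, b in zip(free, bits):
                x[i] = b
            values.add(f(tuple(x)))
        if len(values) == 1:
            return 0                      # the restricted function is constant
        best = n
        for i in free:                    # try examining bit i next
            worst = 0
            for b in (0, 1):
                child = partial[:i] + (b,) + partial[i + 1:]
                worst = max(worst, depth(child))
            best = min(best, 1 + worst)
        return best

    return depth((None,) * n)


def parity(x):
    return sum(x) % 2


def multiplexer(x):
    # x[0] selects which of the other two bits is the answer.
    return x[1] if x[0] else x[2]
```

On four bits, `decision_tree_complexity(parity, 4)` evaluates to 4, in line with the argument above, while the three-bit `multiplexer` needs only two queries.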

1.2 An Evasive Function 

A function $f$ with $D(f)$ equal to the number of variables is said to be
evasive.

A less trivial example of an evasive function is

$$f(x_{ij} : i, j \in \{1, \ldots, n\}) = \bigwedge_i \bigvee_j x_{ij},$$

that is, $f$ is 1 iff every row of the matrix with entries $x_{ij}$ has at
least one 1.

To show $f$ is evasive, we use an adversary argument. We simulate the
computation of some decision tree, except instead of checking the bits of
the input directly, we ask the adversary. The adversary, when asked for the
value of $x_{ij}$, responds 0 as long as some other variable in the row
remains undetermined, and 1 otherwise. In this way the adversary maintains
that the value of the function is undetermined until all variables have been
checked. Note that in this case we show only that some leaf is of depth $n^2$.

1.3 A Non-Evasive Function 

Next we give a non-trivial example of a non-evasive function. Given players
$1, 2, \ldots, n$, let $x_{ij}$, for $1 \le i < j \le n$, be 1 if player $i$
will beat player $j$ if they play each other and 0 if $j$ will beat $i$. (No
draws allowed. Note that this is not necessarily a transitive relation.) The
function $f$ is 1 iff there is some player who will beat everyone.

The object is to determine $f$ without playing all possible matches. To
do this, first play a "knockout tournament": have 1 and 2 play, have the
winner play 3, have the winner play 4, etc. until every player but some player
$i$ has lost to somebody. Now play $i$ against everyone he hasn't played. If
$i$ wins all his matches, $f$ is 1, otherwise $f$ is 0. The number of matches
played in the first stage is $n - 1$, and at most $n - 2$ are played in the
second, so $D(f) \le 2n - 3$. (How can one redesign the first stage to show
$D(f) \le 2n - \lfloor \log_2 n \rfloor$?)
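
The two-stage strategy can be sketched as follows. The function `has_champion` and its callback `beats(i, j)`, which plays one match and reports whether $i$ beats $j$, are hypothetical names introduced for this illustration.

```python
def has_champion(n, beats):
    """Decide whether some player beats everyone.

    Returns (answer, number of matches played). beats(i, j) is assumed to
    return True iff player i beats player j; players are numbered 1..n.
    """
    played = set()

    def play(i, j):
        played.add(frozenset((i, j)))
        return beats(i, j)

    # Stage 1: knockout. The current winner meets each new player in turn.
    winner = 1
    for challenger in range(2, n + 1):
        if not play(winner, challenger):
            winner = challenger

    # Stage 2: the survivor plays everyone he has not met yet.
    for other in range(1, n + 1):
        if other != winner and frozenset((winner, other)) not in played:
            if not play(winner, other):
                return False, len(played)
    return True, len(played)
```

With the transitive rule `beats = lambda i, j: i > j`, the last player wins the knockout after $n - 1$ matches and then plays the $n - 2$ players he has not met, so the bound $2n - 3$ is attained exactly.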

1.4 Non-Deterministic Complexity 

The basic idea behind these two examples is that for most functions (what 
are the two exceptions?) there are proper subsets of the variables whose 
values can determine the value of the function irrespective of the values of 
the other variables. The goal in minimizing decision tree depth is to discover 
the partial assignments as quickly as possible, while the goal in showing large 
decision tree complexity is to show this is not possible. Define 

$$D_1(f) = \max_{x : f(x) = 1} \min\{k : \exists i_1, \ldots, i_k, \epsilon_1, \ldots, \epsilon_k : f|_{x_{i_1} = \epsilon_1, \ldots, x_{i_k} = \epsilon_k} \equiv 1\},$$

$$D_0(f) = \max_{x : f(x) = 0} \min\{k : \exists i_1, \ldots, i_k, \epsilon_1, \ldots, \epsilon_k : f|_{x_{i_1} = \epsilon_1, \ldots, x_{i_k} = \epsilon_k} \equiv 0\},$$

where each $\epsilon_j$ is the value that the input $x$ gives to $x_{i_j}$.

That is, $D_i(f)$ is the least $k$ so that from every assignment $x$ with
$f(x) = i$ we can pick $k$ variables such that assigning only these $k$ values
already forces the function to be $i$. Alternatively, $D_i(f)$ corresponds to
the non-deterministic decision tree complexity of verifying $f(x) = i$, and
$\max\{D_0(f), D_1(f)\}$ is the non-deterministic decision tree complexity of
computing $f$. (A non-deterministic computation may be considered as an
ordinary computation augmented by the power to make lucky guesses.)

1.5 $D(f) \le D_0(f) D_1(f)$

For boolean $x$ let $x^\epsilon$ denote $\bar{x}$ if $\epsilon = 0$ and $x$ if
$\epsilon = 1$. The representation

$$f(x_1, \ldots, x_n) = \bigvee_{l} \bigwedge_{i \in S_l} x_i^{\epsilon_{l,i}}$$

of $f$ in terms of the disjunction of a number of elementary conjunctions of
literals[1] is called a disjunctive normal form (DNF) of $f$.

If we can represent $f$ in DNF so that every elementary conjunction has
at most $k$ terms, then $D_1(f) \le k$, because if any partial assignment of
variables forces $f$ to be 1, it must force some elementary conjunction to
be 1. Conversely, there exists a DNF for $f$ in which every elementary
conjunction has at most $D_1(f)$ terms: for $\epsilon$ with $f(\epsilon) = 1$
let $S_\epsilon$ be the indices of the minimum set of (at most $D_1(f)$)
variables whose assignment $x_i = \epsilon_i$ forces $f$ to 1. Then

$$f(x) = \bigvee_{\epsilon : f(\epsilon) = 1} \bigwedge_{i \in S_\epsilon} x_i^{\epsilon_i}.$$

One can similarly correlate $D_0$ and the conjunctive normal form (CNF) of $f$.

[1] A literal is a boolean variable or its negation.

Next, we show the surprising relation $D(f) \le D_1(f) D_0(f)$. Write $f$
simultaneously in DNF and CNF so that the sizes of the elementary
conjunctions (disjunctions for CNF) do not exceed $D_1(f)$ ($D_0(f)$). To
determine the value of $f$ on an input $x$, we use the following strategy. We
choose the first variable $x_i$ in the first elementary conjunction of the
DNF, and query its value $\epsilon_i$. We then substitute the value for the
variable $x_i$ in the DNF and the CNF and simplify, obtaining a DNF and CNF
for $f' = f|_{x_i = \epsilon_i}$. Since each elementary conjunction in the new
DNF has size at most $D_1(f)$, $D_1(f') \le D_1(f)$. Similarly,
$D_0(f') \le D_0(f)$.

The crucial observation is that each elementary disjunction in the CNF
has a variable (in fact a literal) in common with each elementary conjunction
in the DNF. (Otherwise the variables in the elementary disjunction and the
elementary conjunction can be simultaneously set to force the function to
0 and 1.) Thus by continuing the above process, by the time we have queried
all of the at most $D_1(f)$ variables in the first elementary conjunction, we
have reduced the size of every elementary disjunction by at least 1. It
follows that we can query at most the variables in the first $D_0(f)$
elementary conjunctions before we have determined the value of the function.
Thus $D(f) \le D_0(f) D_1(f)$.

Recalling the earlier remark about non-determinism, $D_1(f)$, and $D_0(f)$,
one might say that the above shows that in this model
$NP \cap \text{co-}NP = P$.
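
As a small illustration, $D_0(f)$ and $D_1(f)$ can be computed by brute force directly from their definition as worst-case certificate sizes. The sketch below assumes $f$ is given as a Python function on 0/1 tuples; the names `certificate_size`, `nondeterministic_complexities`, `conjunction` and `rows` are illustrative only.

```python
from itertools import combinations, product


def certificate_size(f, n, x):
    """Least k such that fixing k bits of x to their values forces f."""
    inputs = list(product((0, 1), repeat=n))
    for k in range(n + 1):
        for S in combinations(range(n), k):
            agreeing = (y for y in inputs if all(y[i] == x[i] for i in S))
            if all(f(y) == f(x) for y in agreeing):
                return k
    return n


def nondeterministic_complexities(f, n):
    """Return the pair (D_0(f), D_1(f))."""
    d = {0: 0, 1: 0}
    for x in product((0, 1), repeat=n):
        d[f(x)] = max(d[f(x)], certificate_size(f, n, x))
    return d[0], d[1]


def conjunction(x):
    return int(all(x))


def rows(x):
    # The function of section 1.2 for a 2 x 2 matrix (x00, x01, x10, x11).
    return int((x[0] or x[1]) and (x[2] or x[3]))
```

For the $2 \times 2$ instance of the function from section 1.2, both values come out as 2, so the bound $D_0(f) D_1(f) = 4$ matches the evasiveness shown there; for the conjunction of three bits the pair is $(1, 3)$.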

1.6 The Aanderaa-Karp-Rosenberg Conjecture 

We can represent functions on graphs by encoding the adjacency matrix in
the input to the function. For an undirected graph with $n$ nodes, we let
$x_{ij}$, for $1 \le i < j \le n$, represent the presence or absence of the
edge $(i, j)$ by taking the value 1 or 0 respectively.

In this way we can represent arbitrary functions on graphs. Generally,
however, we will restrict our attention to graph properties: boolean
functions whose values are independent of the labeling of the nodes of the
graph. Technically, $f : \{x_{ij} : 1 \le i < j \le n\} \to \{0, 1\}$ is a
graph property if for any $\Pi \in S_n$[2] and for any input,

$$f(\ldots, x_{ij}, \ldots) = f(\ldots, x_{\Pi(i)\Pi(j)}, \ldots).$$

The Aanderaa-Karp-Rosenberg (AKR) Conjecture is that any monotone[3],
non-trivial graph property is evasive. It is known to be true for $n$ a
prime power, and counter-examples are known if the monotonicity
requirement is dropped.

A generalization of this conjecture follows. $f$ is weakly symmetric if there
exists a transitive[4] group $G \subseteq S_n$ such that for all $g \in G$,
$f(\ldots, x_i, \ldots) = f(\ldots, x_{g(i)}, \ldots)$. The generalized
conjecture is that any monotone, non-trivial, weakly symmetric boolean
function is evasive.

For example, suppose $f$ = "graph $G$ has no isolated node". First, observe
that for general $f$, if $\#\{x \in \{0,1\}^n : f(x) = 1\}$ is odd, then $f$
is evasive. To see this, observe that for any $x_i$ the above property is
maintained for either $f|_{x_i = 0}$ or $f|_{x_i = 1}$, so that the adversary
can answer queries so as to maintain the property as $f$ is restricted. As
long as the number of unqueried variables is at least 1, the size of the
domain of the restricted function is even, so the property ensures that the
function is not constant.

For the above choice of $f$, an inclusion/exclusion argument shows

$$\#\{G : G \text{ has no isolated vertex}\} = \sum_{k=0}^{n} (-1)^k \binom{n}{k} 2^{\binom{n-k}{2}} \equiv (-1)^{n-1} n + (-1)^n \pmod 2.$$

Thus provided $n$ is even, an odd number of graphs have no isolated nodes,
and $f$ is evasive.
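
The count can be checked numerically for small $n$. The sketch below compares a direct enumeration of all labeled graphs with the inclusion/exclusion sum $\sum_k (-1)^k \binom{n}{k} 2^{\binom{n-k}{2}}$; the function names are illustrative.

```python
from itertools import combinations, product
from math import comb


def count_no_isolated(n):
    """Count labeled graphs on n nodes in which every node has an edge."""
    edges = list(combinations(range(n), 2))
    total = 0
    for x in product((0, 1), repeat=len(edges)):
        covered = set()
        for (u, v), present in zip(edges, x):
            if present:
                covered.update((u, v))
        total += len(covered) == n
    return total


def inclusion_exclusion(n):
    """Choose k nodes forced to be isolated; the rest form any graph."""
    return sum((-1) ** k * comb(n, k) * 2 ** comb(n - k, 2)
               for k in range(n + 1))
```

For $n = 2, 3, 4, 5$ both functions give 1, 4, 41 and 768, so the count is odd exactly for the even values of $n$ in this range.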

Note for later that we can generalize the above condition. In particular,
an inductive argument in the same spirit shows:

Lemma 1.1 $2^{n - D(f)}$ divides $\#\{x : f(x) = 1\}$.

[2] $S_n$ denotes the symmetric group on $n$ elements, also known as the set
of permutations of size $n$.

[3] A graph property is monotone if adding edges to the graph preserves the
property.

[4] $G$ is transitive if $\forall i, j\ \exists g \in G : g(i) = j$.
2 Evasiveness, continued

2.1 Connectivity is Evasive 

If $f$ = "$G$ is connected", then $f$ is evasive. To see this, have the
adversary answer "no" unless that answer would imply that the graph was
disconnected, in which case she answers "yes". In this way the adversary
maintains that a spanning tree exists among the "yes" and unqueried edges.
If some edge $(i, j)$ has not been queried, can the answer be known? If it is
known, it must be "yes", and the "yes" edges must contain a spanning tree, so
a path of "yes" edges connects $i$ to $j$. Of the edges on this path, suppose
the last edge queried is $(u, v)$. At this point, we have a contradiction,
because the adversary could have answered "no" to the query of $(u, v)$ while
maintaining the possible connectedness of the graph through the other "yes"
edges and edge $(i, j)$.

Consideration shows this argument generalizes to any monotone $f$ with
the property that for any $x$ such that $f(x) = 1$, and any $x_i = 1$, we can
set $x_i = 0$, possibly setting some other $x_j = 1$, without changing the
value of the function.
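
The adversary for connectivity is easy to simulate. The sketch below, with illustrative function names, runs her against a given order of edge queries and counts how many queries are made before the answer is forced, which should always be all $\binom{n}{2}$ of them.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """True iff the graph on nodes 0..n-1 with the given edges is connected."""
    adjacent = {v: set() for v in range(n)}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacent[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def queries_until_known(n, order):
    """Run the adversary on the query order; count queries until the
    answer to "is G connected?" is forced."""
    yes, unqueried = set(), set(order)
    for count, edge in enumerate(order, start=1):
        unqueried.remove(edge)
        # Answer "no" unless that would already disconnect the graph.
        if not is_connected(n, yes | unqueried):
            yes.add(edge)
        # Forced "yes": the "yes" edges connect G. Forced "no": even with
        # every unqueried edge present, G would be disconnected.
        if is_connected(n, yes) or not is_connected(n, yes | unqueried):
            return count
    return len(order)
```

Trying every ordering of the six edges on four nodes gives six queries each time, as the argument predicts.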

2.2 "Tree" Functions are Evasive 

A general class of simple but evasive functions are tree functions: those
which have formulas using $\vee$ and $\wedge$ in which every variable occurs
exactly once. The adversary has the following strategy. When asked for the
value of $x_i$, if $x_i$ occurs in a conjunction
$(\cdots \wedge x_i \wedge \cdots)$ in the formula, the adversary claims
$x_i = 1$. Otherwise $x_i$ occurs in a disjunction and the adversary responds
that $x_i = 0$. The adversary plugs the answered value into the formula,
simplifies it, and continues. In this way, the adversary maintains that one
variable is removed from the formula with each question, so the result can
not be known unless every variable has been queried. (Clearly the same proof
applies if the formula also contains negations.)

2.3 The AKR Conjecture is True for Prime n 

Previously we proved that if the number of $x$ with $f(x) = 1$ is odd, then
$f$ is evasive, and noted that this can be generalized to show that
$2^{n - D(f)}$ divides this number. Here is an alternate extension: let $|x|$
denote the number of 1's in $x$. Define

$$\mu(f) = \sum_x f(x) (-1)^{|x|}.$$

Then we can use the property $\mu(f) = \mu(f|_{x_i = 0}) - \mu(f|_{x_i = 1})$
to show that if $\mu(f) \ne 0$ then $f$ is evasive. In particular, the
adversary maintains that $\mu$ applied to the restricted function (i.e. $f$
restricted by the partial assignment given by the adversary's responses so
far) is non-zero, so that the restricted function is non-trivial unless all
variables have been queried. (The reader may want to check the base case of
this argument.)

More generally, define $p_f(t) = \sum_x f(x) t^{|x|}$. Then for the constant
function $c \equiv 1$ of $k$ variables, $p_c(t) = (1 + t)^k$ (and $p_c = 0$
for $c \equiv 0$), and so an inductive argument similar to the above shows

$$(t + 1)^{n - D(f)} \mid p_f(t).$$

Next we use the $\mu$ criterion to prove the generalization of the AKR
conjecture for prime $n$. A counter-example exists with $n = 14$ when $n$ is
not required to be prime.

Theorem 2.1 If $f : \{0,1\}^n \to \{0,1\}$ is weakly symmetric,
$f(\mathbf{0}) \ne f(\mathbf{1})$, and $n$ is prime, then $f$ is evasive.

Proof: We will show $\mu(f) = \sum_x f(x)(-1)^{|x|} \ne 0$. The first part of
the proof is to use the weak symmetry of $f$ and the primality of $n$ to show
that there is a permutation consisting of a single cycle leaving $f$
invariant. The second is to use this fact and the primality of $n$ to group
the inputs yielding $f(x) = 1$, except $\mathbf{0}$ or $\mathbf{1}$, into
equivalence classes of size $n$, thus showing that
$\mu(f) \equiv (-1)^n \pmod n$, so that $\mu(f) \ne 0$.

Since $f$ is weakly symmetric, there exists a transitive subgroup $\Gamma$ of
$S_n$ leaving $f$ invariant. Consider the partition
$\Gamma = U_1 \cup \cdots \cup U_n$ where $g \in U_i$ iff $g(1) = i$. The
transitivity of $\Gamma$ ensures that each $U_i$ is of the same size, so $n$
divides $|\Gamma|$. Since $n$ is prime and $n \mid |\Gamma|$, Cauchy's
theorem implies that $\Gamma$ contains an element $\gamma$ of order $n$.
Since $n$ is prime, such a permutation necessarily consists of a single cycle.

Now (assuming WLOG that $f(\mathbf{0}) = 0$) we partition the inputs $x$ into
classes such that two elements are in the same class iff one is obtainable
from the other by rotation (i.e. application of $\gamma$). Since $\gamma$
leaves $f$ invariant, and $n$ is prime, it follows that unless every $x_i$ is
the same, the $n$ possible rotations of $x$ are all distinct. Thus the values
of $x$ such that $f(x) = 1$ can be partitioned into classes of size $n$,
except for $x = \mathbf{1}$. All members of a class have the same number of
1's, so each class contributes a multiple of $n$ to $\mu(f)$, while
$x = \mathbf{1}$ contributes $(-1)^n$. It follows that
$\mu(f) \equiv (-1)^n \pmod n$, and $\mu(f)$ is not zero. □
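
The counting step of the proof can be checked exhaustively for small primes. The sketch below assumes the single cycle is the rotation of coordinates, enumerates every rotation-invariant $f$ with $f(\mathbf{0}) = 0$ and $f(\mathbf{1}) = 1$, and computes $\mu(f)$; each rotation class of size $n$ contributes a multiple of $n$, so $\mu(f)$ should be congruent to $(-1)^n$ modulo $n$. All function names are illustrative.

```python
from itertools import product


def mu(f, n):
    """mu(f) = sum over x of f(x) * (-1)^|x|."""
    return sum(f(x) * (-1) ** sum(x) for x in product((0, 1), repeat=n))


def rotation_classes(n):
    """Group the inputs in {0,1}^n into classes closed under rotation."""
    classes, seen = [], set()
    for x in product((0, 1), repeat=n):
        if x not in seen:
            orbit = {x[k:] + x[:k] for k in range(n)}
            seen |= orbit
            classes.append(sorted(orbit))
    return classes


def cyclic_functions(n):
    """Yield every rotation-invariant f with f(0..0) = 0 and f(1..1) = 1."""
    classes = [c for c in rotation_classes(n) if len(c) > 1]
    for choice in product((0, 1), repeat=len(classes)):
        table = {(0,) * n: 0, (1,) * n: 1}
        for orbit, value in zip(classes, choice):
            for x in orbit:
                table[x] = value
        yield table.__getitem__
```

For a composite length such as $n = 4$ the input $(0, 1, 0, 1)$ has only two distinct rotations, which is where the primality of $n$ enters.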

[Here is a sketch of how to generalize the theorem for $n = p^a$ a prime
power. It is no longer necessarily true that $\Gamma$ has a cyclic element,
but now $\Gamma$ has a transitive (Sylow) subgroup $\Gamma'$ of order $p^b$,
with $p^b$ but not $p^{b+1}$ dividing $|\Gamma|$.

Again we group the terms of $\mu(f)$ so that two $x$'s are in the same group
if mapped by $\Gamma'$ to each other. We look at the orbits of $\Gamma'$
acting on $\{0,1\}^n$. The number of elements in an orbit divides
$|\Gamma'| = p^b$ and is not equal to 1 unless $x = \mathbf{0}$ or
$x = \mathbf{1}$, and all vectors in the same orbit give the same value
of $f$.

Using this grouping we show $\mu(f) \equiv (-1)^n \pmod p$, so
$\mu(f) \ne 0$.]

Before we observed that
$(t + 1)^{n - D(f)} \mid p_f(t) = \sum_x f(x) t^{|x|}$. We define
$p_f(t_1, \ldots, t_n) = \sum_x f(x) t_1^{x_1} \cdots t_n^{x_n}$ and
generalize this observation in the next lemma.

Lemma 2.2 $p_f \in \langle (t_{i_1} + 1) \cdots (t_{i_{n - D(f)}} + 1) : 1 \le i_1 < \cdots < i_{n - D(f)} \le n \rangle$.[5]

Proof: First, if $f \equiv 0$, $p_f = 0$, and if $f \equiv 1$,
$p_f = \sum_x t_1^{x_1} \cdots t_n^{x_n} = (t_1 + 1) \cdots (t_n + 1)$. If $f$
is not constant, fix a minimum depth decision tree for $f$ and use
$p_f = p_{f|_{x_i = 0}} + t_i\, p_{f|_{x_i = 1}}$ to expand $p_f$ into a sum
of terms, each term corresponding to a "yes" leaf of the tree. Each such term
is of the form $\bigl(\prod_{i \in S_1} t_i\bigr) \times \bigl(\prod_{i \notin S} (t_i + 1)\bigr)$,
where $S_1$ is the set of indices of variables queried and found to be 1, and
$S$ is the set of variables queried. □

Here is another way to look at this result. Let $z_i = t_i + 1$, and

$$\begin{aligned}
q_f(z_1, \ldots, z_n) &= p_f(z_1 - 1, \ldots, z_n - 1) \\
&= \sum_x f(x) (z_1 - 1)^{x_1} \cdots (z_n - 1)^{x_n} \\
&= \sum_x f(x) \sum_{y \le x} (-1)^{|x| - |y|} z_1^{y_1} \cdots z_n^{y_n} \\
&= \sum_y \Bigl( \sum_{x \ge y} (-1)^{|x| - |y|} f(x) \Bigr) z_1^{y_1} \cdots z_n^{y_n}.
\end{aligned}$$

(Here the inequality $y \le x$ means $\forall i, y_i \le x_i$.)

Figure 2: A term of $p_f$ (label in the figure: $x_3(1+x_4)(1+x_5)(1+x_6)$)

[5] $\langle \cdots \rangle$ represents the ideal generated by $\cdots$ (the
smallest set of polynomials closed under subtraction and under multiplication
by any polynomial). An equivalent formulation of this lemma is

$$p_f = \sum_{1 \le i_1 < \cdots < i_{n - D(f)} \le n} P_{i_1, \ldots, i_{n - D(f)}} \times (t_{i_1} + 1) \cdots (t_{i_{n - D(f)}} + 1),$$

where the $P_{i_1, \ldots, i_{n - D(f)}}$ are integer coefficient polynomials
of the $t_i$.

By the previous lemma, the terms in $q_f$ have degree at least $n - D(f)$
in the $z_i$. Thus if $|y| < n - D(f)$, then
$\sum_{x \ge y} (-1)^{|x| - |y|} f(x) = 0$. The left hand side of this
equality is known as the Möbius transform $Mf$ of $f$. Thus we have:

Corollary 2.3 For $|y| < n - D(f)$, $(Mf)(y) = 0$.
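
Corollary 2.3 can be tested exhaustively on all 256 boolean functions of three variables. The sketch below takes $(Mf)(y) = \sum_{x \ge y} (-1)^{|x| - |y|} f(x)$ as the Möbius transform and computes $D(f)$ by brute force; the function names are assumptions made for this illustration.

```python
from itertools import product


def completions(partial):
    """All full inputs that agree with the examined bits of partial."""
    free = [i for i, b in enumerate(partial) if b is None]
    for bits in product((0, 1), repeat=len(free)):
        x = list(partial)
        for i, b in zip(free, bits):
            x[i] = b
        yield tuple(x)


def complexity(f, partial):
    """D of f restricted by partial (None marks an unexamined bit)."""
    if len({f(x) for x in completions(partial)}) == 1:
        return 0
    return min(
        1 + max(complexity(f, partial[:i] + (b,) + partial[i + 1:])
                for b in (0, 1))
        for i, bit in enumerate(partial) if bit is None
    )


def mobius(f, n, y):
    """(Mf)(y): signed sum of f over all x that dominate y coordinatewise."""
    return sum((-1) ** (sum(x) - sum(y)) * f(x)
               for x in product((0, 1), repeat=n)
               if all(a >= b for a, b in zip(x, y)))
```

Taking $y = \mathbf{0}$ recovers the $\mu$ criterion: $(Mf)(\mathbf{0}) = \mu(f)$, so a non-evasive $f$ must have $\mu(f) = 0$.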
3 Non-Evasive Monotone Properties Give Contractable Complexes

In the previous lecture, we showed that every non-trivial weakly symmetric
function on $p^k$ variables for prime $p$ is evasive.

In this lecture we continue our study of evasiveness, introducing some
topological concepts related to simplicial complexes. We show that the
simplicial complex associated with a non-evasive monotone function is
contractable. This is the first part of a technique due to Kahn, Saks, and
Sturtevant; our goal is to prove that all non-trivial, monotone, bipartite
graph properties[6] and non-trivial, monotone graph properties of graphs with
a prime power number of nodes are evasive.



3.1 Simplicial Complexes 

A simplicial complex is a finite collection $\mathcal{K}$ of sets such that

1. $\forall X \in \mathcal{K}$, $Y \subseteq X \Rightarrow Y \in \mathcal{K}$, and

2. $\mathcal{K} \ne \emptyset$.

$V(\mathcal{K})$, the vertices of $\mathcal{K}$, consists of the elements of
the sets in $\mathcal{K}$.

Corresponding to $\mathcal{K}$ one can construct a geometric realization
$\hat{\mathcal{K}} \subset \mathbb{R}^{V(\mathcal{K})}$. First one defines the
mapping $\hat{\ } : V(\mathcal{K}) \to \mathbb{R}^{V(\mathcal{K})}$ so that no
vertex is mapped into the affine hull[7] of any subset of the other vertices
(for instance, one maps the vertices to the unit vectors). Then one extends
$\hat{\ }$ to any set $X$ of vertices by
$\hat{X} = \mathrm{conv}\{\hat{v} : v \in X\}$, and to any collection
$\mathcal{C}$ (such as $\mathcal{K}$) of sets by
$\hat{\mathcal{C}} = \bigcup_{X \in \mathcal{C}} \hat{X}$.

Note that for any $X \in \mathcal{K}$, $\hat{X}$ is a simplex: the convex hull
of a set of vectors none of which lies in the affine hull of any subset of the
others. (Such a set of vectors is said to be affinely independent.)

A collection $\mathcal{K} = \{S_1, \ldots, S_m\}$ of simplices in
$\mathbb{R}^N$ is said to form a geometric simplicial complex if:

1. $\forall S_i \in \mathcal{K}$, $T$ a face[8] of $S_i$ $\Rightarrow T \in \mathcal{K}$, and

2. $\forall S_i, S_j \in \mathcal{K}$, $S_i \cap S_j \ne \emptyset \Rightarrow S_i \cap S_j$ is a face of both $S_i$ and $S_j$.

A polyhedron is then defined as the union of the sets in any geometric
simplicial complex. Note that such an entity is not necessarily convex;
for instance, the surface of an octahedron is a polyhedron formed by the
geometric simplicial complex consisting of its faces, edges, and vertices.

[6] A bipartite graph property $f(x_{ij} : i \in V, j \in W)$ is a boolean
function invariant under permutations of the edges induced by permutations of
$V$ and $W$. See "graph property".

[7] $\mathrm{affine}\{S\} = \{\sum_{v \in S} \alpha_v v : \sum \alpha_v = 1\}$;
$\mathrm{conv}\{S\} = \{\sum_{v \in S} \alpha_v v : \sum \alpha_v = 1, \alpha_v \ge 0\}$.

[8] A face of a simplex $S = \mathrm{conv}\{V\}$ is a set
$S' = \mathrm{conv}\{V'\}$ with $V' \subseteq V$.

3.2 Contractability 

Intuitively, a set $T \subseteq \mathbb{R}^N$ is contractable if it can be
continuously shrunk to a single point, while never breaking through its
original boundary. Technically, $T$ is contractable if there exists a
continuous mapping $\Phi : T \times [0,1] \to T$ with, $\forall x \in T$,
$\Phi(x, 0) = x$ and $\Phi(x, 1) = p_0$ for some $p_0 \in T$. One can show
that the choice of $p_0$ is immaterial.

If $T$ consists of 2 distinct points in $\mathbb{R}^1$, no such mapping can
exist because at some time the mapping would have to switch from mapping a
point to itself to mapping the point to the other point.

If the underlying simplicial complex is a graph and the graph is
disconnected, one can similarly show that the set is not contractable.
Similarly, if the graph has a cycle, at any time the cycle will be in the
image of the mapping, so a cyclic graph is not contractable.

Conversely, if the graph is a tree, then one can contract the graph by
repeatedly contracting the edges leading to leaves.

(Note that we consider contractability of a simplicial complex synonymous
with the contractability of its geometric realizations.)

Generalizing the contraction of a tree described above, we will obtain
a useful sufficient condition for contractability. (Surprisingly, for a
general simplicial complex $\mathcal{K}$, it is undecidable whether
$\mathcal{K}$ is contractable.) For $v \in V(\mathcal{K})$, define

$$\mathcal{K} \setminus v = \{X \in \mathcal{K} : v \notin X\},$$
$$\mathcal{K} / v = \{X \in \mathcal{K} : v \notin X,\ X \cup \{v\} \in \mathcal{K}\}.$$

The first is called $\mathcal{K}$ minus $v$; the second is called the link of
$v$ in $\mathcal{K}$.

Considering the boundary of the 3 dimensional simplex, $\partial\Delta_3$,
which is not contractable, one sees that
$\partial\Delta_3 \setminus v = \Delta_2$ is contractable but
$\partial\Delta_3 / v$, a three node cycle, is not.
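
The two operations are simple to compute on an abstract complex stored as a set of frozensets. The sketch below reproduces the example of the boundary of the 3 dimensional simplex; the helper names `all_subsets`, `minus` and `link` are illustrative.

```python
from itertools import combinations


def all_subsets(vertices):
    """The full simplex on the given vertices, as a set of frozensets."""
    vs = list(vertices)
    return {frozenset(c)
            for k in range(len(vs) + 1)
            for c in combinations(vs, k)}


def minus(K, v):
    """K minus v: the sets of K that do not contain v."""
    return {X for X in K if v not in X}


def link(K, v):
    """Link of v in K: sets X of K with v not in X and X + {v} in K."""
    return {X for X in K if v not in X and X | {v} in K}


# Boundary of the 3-dimensional simplex: every proper subset of {0,1,2,3}.
boundary = all_subsets(range(4)) - {frozenset(range(4))}
```

Here `minus(boundary, 3)` is the full triangle on $\{0, 1, 2\}$, while `link(boundary, 3)` is that triangle without its interior, i.e. the three node cycle.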

Lemma 3.1 If for some $v$, $\mathcal{K}/v$ and $\mathcal{K} \setminus v$ are
contractable, then $\mathcal{K}$ is contractable.

Proof: Let $\mathcal{C}$ denote $\{X \in \mathcal{K} : v \in X\}$, so that
$\mathcal{K} = \mathcal{C} \cup \mathcal{K} \setminus v$.

The first step in contracting $\mathcal{K}$ is to use the contractability of
$\mathcal{K}/v$ to construct a mapping $\Psi$ which contracts[9]
$\mathcal{C}$ onto $\mathcal{K}/v$, leaving $\mathcal{K} \setminus v$ fixed.
Once this is accomplished, all of the points have been contracted into
$\mathcal{K} \setminus v$, so applying the contraction of
$\mathcal{K} \setminus v$ completes the contraction of $\mathcal{K}$.

Suppose $\mathcal{K}/v$ is contracted by $\Phi$ to $p_0$. Denote a point $p$
in $\mathcal{C}$ by $(p', \lambda)$, where $p' \in \mathcal{K}/v$ and
$\lambda \in [0, 1]$ are such that $p = \lambda p' + (1 - \lambda) v$. (Note
that this denotation is continuous and invertible except at $v$.)

Let $\mathcal{C}_\lambda = \{(p', \lambda) : p' \in \mathcal{K}/v\}$, so
$\mathcal{C}_1 = \mathcal{K}/v$ and $\mathcal{C}_0 = \{v\}$.

The contraction can be envisioned as flattening $\mathcal{C}$. At time $t$,
each $\mathcal{C}_\lambda$ for $\lambda < t$ will have been flattened into
$\mathcal{C}_t$, until all of $\mathcal{C}$ is flattened into
$\mathcal{K}/v$. Once $\mathcal{C}_\lambda$ is mapped onto $\mathcal{C}_t$,
as $t$ grows, instead of letting the image of $\mathcal{C}_\lambda$ grow with
$\mathcal{C}_t$ (which would lead to a discontinuity at $v$), we contract it
using $\Phi$ to counteract the growth.

$$\Psi(p, t) = \begin{cases} (\Phi(p', 1 - \lambda/t),\ t) & \text{if } t > \lambda, \text{ and} \\ p & \text{if } t \le \lambda. \end{cases}$$

If $t = 0$ or $\lambda = 1$ then $\Psi$ is the identity. If $t = 1$ all
points are mapped into $\mathcal{K}/v$. We leave it to the reader to verify
the continuity of $\Psi$, remarking only that as $p \to v$, $\lambda \to 0$,
so $\Phi(p', 1 - \lambda/t) \to p_0$, independent of $p'$. □




3.3 Monotone Functions 

A monotone boolean function $f \not\equiv 1$ gives a simplicial complex

$$\mathcal{K}_f = \{S \subseteq \{1, \ldots, n\} : f(x^S) = 0\}$$

in a natural way, and vice versa.[10] Also,

$$\mathcal{K}_{f|_{x_i = 0}} = \{S \subseteq \{1, \ldots, i-1, i+1, \ldots, n\} : S \in \mathcal{K}_f\} = \mathcal{K}_f \setminus i,$$
$$\mathcal{K}_{f|_{x_i = 1}} = \{S \subseteq \{1, \ldots, i-1, i+1, \ldots, n\} : S \cup \{i\} \in \mathcal{K}_f\} = \mathcal{K}_f / i.$$

By now we may begin to suspect a relation between non-evasiveness and
contractability. We prove such a relation in the next lemma.

[9] We generalize the notion of contraction to a point in the natural way to
allow contraction to arbitrary contractable subsets.

[10] $(x^S)_i = 1$ if $i \in S$, and $(x^S)_i = 0$ if $i \notin S$.
Lemma 3.2 (Kahn-Saks-Sturtevant) If $f \not\equiv 1$ is non-evasive, then
$\mathcal{K}_f$ is contractable.

Proof: Assume $f : \{0,1\}^n \to \{0,1\}$ is non-evasive, with
$f \not\equiv 1$.

If $n > 1$, then there exists an $i$ such that $f|_{x_i = 0}$ and
$f|_{x_i = 1}$ are non-evasive. Provided $f|_{x_i = 1} \not\equiv 1$, we can
assume by induction that $\mathcal{K}_{f|_{x_i = 0}}$ and
$\mathcal{K}_{f|_{x_i = 1}}$ are contractable. By the preceding lemma and
remarks, it follows that $\mathcal{K}_f$ is contractable.

If $n > 1$ and $f|_{x_i = 1} \equiv 1$, then
$\mathcal{K}_f = \mathcal{K}_{f|_{x_i = 0}}$, and $f|_{x_i = 0}$ is
non-evasive, so again by induction $\mathcal{K}_f$ is contractable.

Otherwise $n = 1$, so $f \equiv 0$ and
$\mathcal{K}_f = \{\emptyset, \{1\}\}$, which is contractable.

□

We have now established a link between evasiveness of monotone func- 
tions and the contractability of the associated simplicial complex. In the 
next lecture, we will use the symmetry properties of monotone graph prop- 
erties and some more topology to show in some cases that the associated 
complexes are not contractable, and thus that the original functions are 
evasive. 
4 Fixed Points of Simplicial Maps Show Evasiveness

In the previous lecture we showed that the simplicial complex associated 
with a monotone non-evasive function is contractable. In this lecture we 
present the following argument. 

Standard fixed point theorems in topology tell us that a continuous
function mapping a contractable polyhedron into itself has a fixed point. On
the other hand, the invariance of a monotone function $f$ under a
permutation $\pi$ of the inputs implies that the geometric realization of the
permutation maps $\mathcal{K}_f$ into itself, and thus has a fixed point if
$f$ is non-evasive. For $f$ a monotone bipartite graph property or a monotone
graph property on graphs with a prime power number of nodes, we characterize
the possible fixed point sets of such mappings to show that if $f$ is
non-trivial, no fixed point can exist, so that $f$ is evasive.

4.1 Fixed Points of Simplicial Mappings 

Suppose $\mathcal{K}$ and $\mathcal{K}'$ are (abstract) simplicial complexes.
Then $\varphi : V(\mathcal{K}) \to V(\mathcal{K}')$ is a simplicial map
provided $\forall X \in \mathcal{K}$, $\varphi(X) \in \mathcal{K}'$, that is,
provided $\varphi$ preserves the property of being in the complex when
applied to sets. Such a map yields a continuous linear map
$\hat\varphi : \hat{\mathcal{K}} \to \hat{\mathcal{K}}'$ by mapping the
vertices of $\mathcal{K}$ in correspondence with $\varphi$, and mapping
convex combinations of the vertices to the corresponding convex combinations
of their images:

$$\hat\varphi\Bigl(\sum_{v \in V(\mathcal{K})} \alpha_v v\Bigr) = \sum_{v \in V(\mathcal{K})} \alpha_v \varphi(v).$$

Note that for $x \in \hat{\mathcal{K}}$, the representation
$x = \sum_{v \in V(\mathcal{K})} \alpha_v v$ with $\sum_v \alpha_v = 1$ is
unique. The set $\Delta_x = \{v : \alpha_v \ne 0\}$, a simplex of
$\mathcal{K}$, is called the support simplex of $x$, and is, of course, also
unique.

So given a simplicial map $\varphi : \mathcal{K} \to \mathcal{K}$, which for
our purposes we will assume is one-to-one, what are the fixed points? Suppose
$x$ is a fixed point with support simplex $\Delta$. Then $\varphi(\Delta)$
contains $x$, and hence contains $\Delta$. Since $\varphi(\Delta)$ and
$\Delta$ are the same size, it follows that $\varphi(\Delta) = \Delta$, that
is, $\varphi$ permutes the vertices in $\Delta$. This in turn implies that the
center of gravity[11] of $\Delta$ is also fixed by $\hat\varphi$.

[11] The center of gravity of a face $H$ is $\sum_{v \in H} v / |H|$.
Figure 5: Fixed Points of Simplicial Maps of $\Delta_3$

What are the other fixed points in $\Delta$? If the orbits[12] of the
permutation induced by $\varphi$ on $\Delta$ are $H_1, \ldots, H_k$, then
each $H_i$ is a face of $\Delta$, with the center of gravity a fixed point.
Also, any convex combination of these centers of gravity is a fixed point.

Conversely, if $x = \sum_{v \in \Delta} \alpha_v v$ is a fixed point, then
$x = \hat\varphi(x) = \sum_{v \in \Delta} \alpha_v \varphi(v)$ is also a
representation of $x$, and because the representation of $x$ in this way is
unique, there exists a permutation $\pi$ of the vertices in $\Delta$ such
that $\alpha_{\pi(v)} \pi(v) = \alpha_v \varphi(v)$, that is,
$\varphi(v) = \pi(v)$ and $\alpha_v = \alpha_{\pi(v)}$. It follows that
$\alpha_v = \alpha_{\pi^k(v)}$, so that for each orbit $H_i$ of $\varphi$ on
$\Delta$, we can choose $\beta_i$ so that
$u \in H_i \Rightarrow \alpha_u = \beta_i$. Thus, writing $w_i$ for the
center of gravity of $H_i$, we have

$$x = \sum_{v \in \Delta} \alpha_v v = \sum_i \beta_i \sum_{v \in H_i} v = \sum_i \beta_i |H_i| w_i.$$

In other words, $x$ is a convex combination of the centers of mass of the
faces corresponding to the orbits.

To view this from a more combinatorial perspective, suppose the orbits
of $\varphi$ on the vertices of $\mathcal{K}$ are $H_1, \ldots, H_N$, and
assume that the first $t$ of these are those which are also simplices of
$\mathcal{K}$. Let $w_i$, for $1 \le i \le t$, denote the center of gravity
of $H_i$. Then each $w_i$ is a fixed point, and any proper convex
combination $x$ of a subset
$\{w_{i_1}, \ldots, w_{i_r}\} \subseteq \{w_1, \ldots, w_t\}$ is also a fixed
point, provided only that $x$ is in fact in $\hat{\mathcal{K}}$, that is,
provided $H_{i_1} \cup \cdots \cup H_{i_r} \in \mathcal{K}$.

In sum, if $\mathrm{fix}(\hat\varphi)$ denotes the fixed points of
$\hat\varphi$, then $\mathrm{fix}(\hat\varphi) = \hat{\mathcal{H}}$, where

$$\mathcal{H} = \{\{i_1, \ldots, i_r\} : H_{i_1} \cup \cdots \cup H_{i_r} \in \mathcal{K}\},$$

and the vertices $V(\mathcal{H})$ of the simplicial complex $\mathcal{H}$ are
$\{1, \ldots, t\}$, with $\hat{i}$ taken to be the center of gravity of the
face $H_i$.

[12] The orbits, or cycles, of a permutation are the minimal sets of elements
such that the permutation takes no element out of its set.
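
The combinatorial description of the fixed point set translates directly into a short computation: find the orbits of the vertex permutation, keep those that are simplices, and collect the index sets whose union is again a simplex. The sketch below stores a complex as a set of frozensets and a permutation as a dictionary; orbits are indexed from 0 rather than 1, and all names are illustrative.

```python
from itertools import combinations


def all_subsets(vertices):
    """The full simplex on the given vertices, as a set of frozensets."""
    vs = list(vertices)
    return {frozenset(c)
            for k in range(len(vs) + 1)
            for c in combinations(vs, k)}


def orbits(perm):
    """Cycles of a permutation given as a dict vertex -> image."""
    seen, result = set(), []
    for v in perm:
        if v not in seen:
            orbit = []
            while v not in seen:
                seen.add(v)
                orbit.append(v)
                v = perm[v]
            result.append(frozenset(orbit))
    return result


def fixed_point_complex(K, perm):
    """Return (faces, H): the orbits that are simplices of K, and the
    complex of index sets of such orbits whose union lies in K."""
    faces = [h for h in orbits(perm) if h in K]
    H = {frozenset(I)
         for r in range(len(faces) + 1)
         for I in combinations(range(len(faces)), r)
         if frozenset().union(*(faces[i] for i in I)) in K}
    return faces, H
```

For the full simplex on four vertices and the permutation swapping 0 with 1 and 2 with 3, the fixed point set is the segment joining the midpoints of the edges $\{0, 1\}$ and $\{2, 3\}$, so $\mathcal{H}$ is the full simplex on two vertices.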
4.2 Fixed Point Theorems

Next we give some theorems which give sufficient conditions for the existence 
of fixed points, and which characterize some useful properties of the fixed 
point sets. The first is Brouwer's fixed point theorem. 

Theorem 4.1 (Brouwer) Any continuous map of a simplex to itself has 
a fixed point. 

An alternate formulation of this theorem follows. 

Theorem 4.2 There does not exist a continuous map from a simplex to its 
boundary leaving the boundary fixed. 

If there were a continuous function $f : S \to S$ with no fixed point, we
could construct a function $g : S \to \partial S$ leaving $\partial S$ fixed
as follows. Given $x \in S$, obtain $g(x)$ by projecting a ray from the point
$f(x)$ through the point $x$ to the boundary $\partial S$.

Note that for any polyhedron $S$, if $\partial S$ is not contractable, then
there can not exist a continuous map from $S$ to its boundary leaving the
boundary fixed.

[Also recall our earlier claim that a set is contractable iff the cone[13]
formed by the set with an affinely independent point $v$ can be mapped
continuously to the set.]

Theorem 4.3 (Lefschetz) If $\mathcal{K}$ is contractable, then any continuous
map from $\hat{\mathcal{K}}$ to itself has a fixed point.
4.3 Application to Graph Properties 

We are finally in a position to apply these techniques to show evasiveness. 
Recall the previous result that a function $f : \{0,1\}^n \to \{0,1\}$
invariant under some cyclic permutation of its inputs and with
$f(\mathbf{0}) \ne f(\mathbf{1})$ is evasive, provided the number of inputs
is prime. To start, we show that any non-trivial monotone function invariant
under a cyclic permutation of the inputs is evasive.

Lemma 4.4 Suppose $f : \{0,1\}^n \to \{0,1\}$ is a monotone, non-trivial
function invariant under a cyclic permutation of its inputs. Then $f$ is
evasive.

[13] The cone of a set with a vertex $v$ consists of the convex combinations
of $v$ with points in the set.
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Proof: Assume without loss of generality that / is invariant under the 
permutation ip(i) = i + 1 mod n, and assume / is non-evasive. Consider ICf. 
As shown in the previous lecture, ICf is contractable. Since / is invariant 
under ip, <p is a simplicial map ofKt, so that tp has a fixed point (by Lefshetz' 
theorem). 

As discussed in the beginning of this lecture, the fixed points correspond 
to orbits of (p contained in ICf. Since the only orbit of <p is {1, 2, ... , n}, this 
set must be in Kf. Thus /(l, 1, . . . , 1) =0, a contradiction. rj 
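Lemma 4.4 can be spot-checked on small cases by computing the deterministic
decision tree complexity through exhaustive search. The following is only a
sketch under stated assumptions: the function names and the test family
`cyclic_run` (1 iff some r cyclically consecutive inputs are all 1) are
introduced here for illustration and are not part of the lecture, and the
search is exponential in n.

```python
from functools import lru_cache
from itertools import product


def decision_tree_depth(f, n):
    """Minimum worst-case number of queries needed to evaluate f on n bits."""
    inputs = list(product((0, 1), repeat=n))

    @lru_cache(maxsize=None)
    def depth(partial):
        # partial[i] is 0, 1, or None (variable i not queried yet)
        consistent = [x for x in inputs
                      if all(p is None or p == b for p, b in zip(partial, x))]
        if len({f(x) for x in consistent}) == 1:
            return 0  # the answers so far already determine f
        best = n
        for i, p in enumerate(partial):
            if p is None:
                # the adversary picks the worse answer for the query of bit i
                worst = max(depth(partial[:i] + (b,) + partial[i + 1:])
                            for b in (0, 1))
                best = min(best, 1 + worst)
        return best

    return depth((None,) * n)


def cyclic_run(n, r):
    """Hypothetical test family: monotone and invariant under i -> i+1 mod n."""
    return lambda x: int(any(all(x[(s + j) % n] for j in range(r))
                             for s in range(n)))


def is_evasive(f, n):
    return decision_tree_depth(f, n) == n
```

For example, `is_evasive(cyclic_run(5, 2), 5)` asks whether the property
"two cyclically adjacent inputs are both 1" on five inputs forces every
deterministic algorithm to read all five bits on some input.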

The success of this technique hinges on our being able to characterize 
the orbits, and hence the fixed point set, of a permutation under which the 
function is invariant. The proof of the next theorem is essentially the same 
as the previous, except that the characterization of the orbits is trickier. 
Before we give the theorem, we give the Hopf index formula, an extension
of Lefschetz' fixed point theorem which we need for the proof.

Theorem 4.5 (Hopf Index Formula) For $\varphi$ a simplicial one-to-one mapping
of $\mathcal{K}$, a contractible simplicial complex, the Euler characteristic
(footnote 14) of the fixed point set $\mathcal{H}$ of $\varphi$ is $-1$.

Theorem 4.6 (Yao) Non-trivial monotone bipartite graph properties are 
evasive. 

Proof: Let $f(x_{ij} : i \in U, j \in W)$ be a non-evasive monotone bipartite
graph property. Let $\varphi$ be a permutation of the edges corresponding to a
cyclic permutation of the vertices of $W$ while leaving $U$ fixed.
The fixed point set of $\varphi$ on $\mathcal{K}_f$ is characterized by:

$$\mathrm{fix}(\varphi) = \mathcal{H}, \qquad \{u_{i_1}, \ldots, u_{i_r}\} \in \mathcal{H} \iff f(x_{\{u_{i_1}, \ldots, u_{i_r}\} \times W}) = 0,$$

with the vertices of $\mathcal{H}$ being the centers of gravity of the faces
corresponding to the orbits of $\varphi$.

The orbits of $\varphi$ correspond to the nodes of $U$: each orbit contains all of
the edges touching a single node of $U$, so we identify each vertex of $\mathcal{H}$
with a node of $U$. Since $\mathcal{K}_f$ is contractible, there are fixed points, so
some edge set $\{u\} \times W$ has $f(x_{\{u\} \times W}) = 0$ (footnote 15). By the symmetry
of $f$, therefore, any choice of $u$ yields $f(x_{\{u\} \times W}) = 0$. The sets in $\mathcal{H}$
correspond to edge sets $\{u_{i_1}, \ldots, u_{i_r}\} \times W$, and again by symmetry either
all or none of these sets for any given $r$ are in $\mathcal{H}$. By monotonicity,
then, $\mathcal{H}$ is characterized by

$$\{u_{i_1}, \ldots, u_{i_r}\} \in \mathcal{H} \iff r \le r_0$$

for some $r_0$. The Euler characteristic of $\mathcal{H}$ is thus

$$\chi(\mathcal{H}) = -\binom{|U|}{1} + \binom{|U|}{2} - \cdots + (-1)^{r_0}\binom{|U|}{r_0} = (-1)^{r_0}\binom{|U|-1}{r_0} - 1.$$

By the Hopf formula, this equals $-1$, which implies that $r_0 = |U|$, i.e.
$U \in \mathcal{H}$, so $U \times W \in \mathcal{K}_f$, and $f(x_{U \times W}) = 0$. Thus $f \equiv 0$. □

14 The Euler characteristic $\chi(\mathcal{K})$ of a polyhedron $\mathcal{K}$ is defined by
$\chi(\mathcal{K}) = \sum_{\emptyset \ne S \in \mathcal{K}} (-1)^{|S|}$. (Recall $\mu(f)$.) The Euler characteristic is
invariant under topological deformation, and is thus useful for classifying
topological types. For instance, a contractible set has Euler
characteristic $-1$.
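The last step of the proof rests on an alternating binomial sum. A small
numerical check, assuming the sign convention $\chi = \sum_{\emptyset \ne S} (-1)^{|S|}$ of
footnote 14 and writing m for $|U|$ (the function names are illustrative
only):

```python
from math import comb


def euler_characteristic(m, r0):
    """chi of the complex whose faces are the nonempty subsets of an
    m-element set having at most r0 elements."""
    return sum((-1) ** r * comb(m, r) for r in range(1, r0 + 1))


def closed_form(m, r0):
    """(-1)^r0 * C(m-1, r0) - 1, the closed form of the same sum."""
    return (-1) ** r0 * comb(m - 1, r0) - 1
```

The two expressions agree for all small m and r0, and the value is -1
exactly when r0 = m, which is the case the proof singles out.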

One might expect the proof to be simpler for a general graph property, 
which is invariant under a larger class of permutations. Unfortunately, since 
a general graph has more edges, the orbits of any given permutation are 
generally more complicated. However, when the number of nodes is a prime 
power, we can still characterize the orbits, and thus show evasiveness. Before 
we show this, we give a more general fixed point theorem. 

Theorem 4.7 Let $\Gamma$ be a group of mappings of a contractible geometric
complex $\mathcal{K}$ onto itself. Let $\Gamma_1$ be a normal subgroup (footnote 16) of $\Gamma$
with $|\Gamma_1| = p^k$ for a prime $p$, and with $\Gamma/\Gamma_1$ cyclic. Then there exists
an $x \in \mathcal{K}$ such that

$$\forall \varphi \in \Gamma, \quad x = \varphi(x).$$

Note that we are no longer talking about the fixed points of a single
simplicial mapping, but rather the fixed points of a group of simplicial
mappings. If the orbits (footnote 17) of $\Gamma$ are $H_1, \ldots, H_N$, with the
first $t$ of these in $\mathcal{K}$, and $w_i$ ($1 \le i \le t$) is the center of gravity of
$H_i$, then $\mathrm{fix}(\Gamma) = \mathcal{H}$, where:

$$\{i_1, \ldots, i_r\} \in \mathcal{H} \iff H_{i_1} \cup \cdots \cup H_{i_r} \in \mathcal{K}$$

and $V(\mathcal{H}) = \{1, \ldots, t\}$, with $i \equiv w_i$.

15 Considered as acting on the complete simplex $\Delta_{|U| \times |W|}$, $\varphi$ necessarily
has fixed points. The question is whether any of these fixed points are in
$\mathcal{K}_f$.

16 A subgroup $\Gamma_1$ of $\Gamma$ is normal if $\forall x \in \Gamma$, $x \Gamma_1 x^{-1} = \Gamma_1$.

17 The orbits of a collection of permutations are the minimal sets invariant
under every permutation.

To see this, note first that for any $\varphi \in \Gamma$, an orbit of $\Gamma$ is expressible
as a disjoint union of orbits of $\varphi$, so any $w_i$ is expressible as a convex
combination of centers of mass of the faces corresponding to orbits of $\varphi$,
so $w_i$ is fixed by $\varphi$. Thus any convex combination of the $w_i$ is fixed by
each $\varphi$.

Conversely, if $x = \sum_{v \in A} \alpha_v v$ is a fixed point of $\Gamma$, then for any $u, w$ in
an orbit $H_i$ of $\Gamma$ there is a $\varphi$ such that $\varphi(u) = w$ and
$x = \varphi(x) = \sum_{v \in A} \alpha_v \varphi(v)$, so the uniqueness of the representation of $x$
implies $\alpha_u = \alpha_w$. Consequently we can choose $\beta_1, \ldots, \beta_t$ such that
$\forall i$, $u \in H_i \Rightarrow \alpha_u = \beta_i$, and

$$x = \sum_{v \in A} \alpha_v v = \sum_{H_i \subseteq A} \sum_{v \in H_i} \beta_i v = \sum_{H_i \subseteq A} \beta_i |H_i| w_i.$$

Thus the previous techniques continue to apply when we have a group 
of simplicial mappings. The previous theorem is exactly what we need for 
the proof of the next theorem: 

Theorem 4.8 Suppose $f$ is a non-trivial monotone graph property on graphs
whose number of nodes is a prime power $p^k$. Then $f$ is evasive.

Proof: Think of the nodes of our graph $G$ as identified with $GF(p^k)$
(footnote 18). Consider the linear mappings $x \mapsto ax + b : GF(p^k) \to GF(p^k)$
with $a \ne 0$ as a group $\Gamma$. Let $\Gamma_1$ be the mappings $x \mapsto x + b$, so
$|\Gamma_1| = p^k$. The normality of $\Gamma_1$ follows from
$(((ax + b) + b') - b)/a = x + b'/a$. The factor group $\Gamma/\Gamma_1$ is isomorphic to
the group of mappings $x \mapsto ax$, $a \ne 0$, i.e. the multiplicative group
(footnote 19) of $GF(p^k)$, which is known to be cyclic. Thus the preceding
theorem applies, and $\Gamma$ has a fixed point on $\mathcal{K}_f$ whenever $\mathcal{K}_f$ is
contractible. Since $\Gamma$ is transitive on the edges, the only orbit of $\Gamma$
consists of all of the edges. Thus if $f$ is non-evasive, so that $\mathcal{K}_f$ is
contractible and $\Gamma$ has a fixed point, then $\mathcal{K}_f$ must have as an element
the set of all edges, and $f \equiv 0$. □
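The group-theoretic facts used in this proof can be checked directly in the
simplest case k = 1, where $GF(p)$ is arithmetic modulo a prime p. The
sketch below makes that restriction explicitly (general prime powers would
need polynomial arithmetic, which is not attempted here), and all function
names are illustrative assumptions.

```python
from itertools import combinations


def affine_maps(p):
    """All maps x -> a*x + b (mod p) with a != 0, encoded as pairs (a, b)."""
    return [(a, b) for a in range(1, p) for b in range(p)]


def all_edges(p):
    return {frozenset(e) for e in combinations(range(p), 2)}


def edge_orbit(p, edge):
    """Orbit of one unordered pair of nodes under the affine group."""
    u, v = edge
    return {frozenset(((a * u + b) % p, (a * v + b) % p))
            for a, b in affine_maps(p)}


def compose(g, h, p):
    """g o h, i.e. x -> g(h(x))."""
    (a, b), (c, d) = g, h
    return ((a * c) % p, (a * d + b) % p)


def inverse(g, p):
    a, b = g
    a_inv = pow(a, -1, p)  # modular inverse, Python 3.8+
    return (a_inv, (-a_inv * b) % p)


def translations_are_normal(p):
    """Conjugating a translation (1, t) by any g gives another translation."""
    return all(compose(compose(g, (1, t), p), inverse(g, p), p)[0] == 1
               for g in affine_maps(p) for t in range(p))


def multiplicative_group_is_cyclic(p):
    return any(len({pow(g, e, p) for e in range(1, p)}) == p - 1
               for g in range(1, p))
```

With p = 7, the orbit of the edge {0, 1} is the whole edge set, which is the
transitivity the proof relies on.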



18 $GF(p^k)$ is the Galois field of order $p^k$. For $k = 1$ this is the field
of integers $0, 1, \ldots, p - 1$ under arithmetic modulo $p$. For larger $k$, it
is the field of polynomials over $GF(p)$ modulo an irreducible polynomial of
degree $k$.

19 The multiplicative group of a field is the group formed on the elements
other than 0 under multiplication.

5 Non-Deterministic and Randomized Decision Trees

In this lecture we use previous results to show an $\Omega(n^2)$ lower bound on
the decision tree complexity of a general non-trivial monotone graph
property, and we begin discussion of decision trees for probabilistic
algorithms.



5.1 Near Evasiveness of Monotone Graph Properties 

The key lemma in our study so far of the decision tree complexity of
monotone Boolean functions has been:

If $f : \{0,1\}^n \to \{0,1\}$ is a non-evasive, monotone function with
$f(\mathbf{0}) = 0$, then $\mathcal{K}_f$ is contractible.

We also noted that:

If $\mathcal{K}_f$ is contractible then $\chi(\mathcal{K}_f) = -1$ (i.e. $\sum_{S : f(S) = 0} (-1)^{|S|} = 0$).

We would like to extend this theory to general (non-monotone) functions
as well. However, $\mathcal{K}_f$, as defined, is not a simplicial complex in the case
of a non-monotone function $f$. Although the AKR conjecture is known to be
false for non-monotone functions, if $f$ is weakly symmetric,
$f(\mathbf{0}) \ne f(\mathbf{1}) = 1$, and the number of variables of $f$ is prime, we have shown
that $f$ is evasive (Theorem 2.1).

We have seen that monotone graph properties on graphs with a prime
power number of nodes (Theorem 4.8), or on bipartite graphs (Theorem 4.6),
are evasive. Next we show that for any non-trivial monotone graph property,
by restricting the property to a suitable subgraph on $\Theta(n)$ nodes, we can
obtain one of these two kinds of properties. Since $D(f)$ is at least
$D(f|_R)$ for any restriction $R$, this will imply that any non-trivial
monotone graph property $f$ has $D(f) = \Omega(n^2)$.

Theorem 5.1 Let $f$ be a monotone, non-trivial graph property. Then
$D(f) \ge cn^2$ for some positive constant $c$.



Proof: Let $G$ be a graph on $n$ nodes with node set $V$. Choose a prime $p$
such that $n/2 < p < 2n/3$ (it follows from number theoretic arguments that
such a prime exists for all sufficiently large $n$). Let $S$ be a subset of
the nodes of $G$ such that $|S| = p$. Let $K_S$ denote the complete graph on
$S$, with all other $n - p$ nodes being isolated. Since $f$ is monotone and
non-trivial, $f(\mathbf{0}) = 0$. Now, there are several cases:

Case 1, $f(K_S) = 1$ (footnote 20). Let $R$ be the restriction $x_{i,j} = 0$ for
every pair $(i,j)$ not contained in $S$. Then $f|_R$ is a monotone,
non-trivial graph property on a graph with a prime number of nodes, and

$$D(f) \ge D(f|_R) = \binom{p}{2} > (n^2 - n)/8.$$

20 $f(G)$ for a graph $G = (V, E)$ is shorthand for $f(x_E)$.

Case 2, $f(K_S) = 0$. In this case $f(K_{V \setminus S}) = 0$, since $K_{V \setminus S}$ is a complete
graph that is smaller than $K_S$ ($n/2 < p$), and $f$ is monotone.

Let $H = K_{V \setminus S} \cup \{S \times (V \setminus S)\}$. (Note the abuse of notation here, as we
are really interested in unordered pairs.)

    Case 2.1, $f(H) = 1$. Let $R = \{x_{i,j} = 0 : i, j \in S\} \cup \{x_{i,j} = 1 : i, j \in V \setminus S\}$.
    Then $f|_R(\mathbf{0}) = f(K_{V \setminus S}) = 0$, and $f|_R(\mathbf{1}) = f(H) = 1$,
    and $f|_R$ is a monotone bipartite graph property on a graph with
    $p(n - p)$ edges. Thus $D(f) \ge D(f|_R) = p(n - p) > 2n^2/9$.

    Case 2.2, $f(H) = 0$. Let $R = \{x_{i,j} = 1 : (i,j) \in H\}$. Then
    $f|_R(\mathbf{0}) = f(H) = 0$, $f|_R(\mathbf{1}) = f(\mathbf{1}) = 1$, and $f|_R$ is a monotone graph
    property on the subgraph induced by the vertices in $S$, so as in
    case 1 $D(f) \ge (n^2 - n)/8$.



□ 
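The two numerical bounds used in the case analysis, $\binom{p}{2} > (n^2 - n)/8$
and $p(n - p) > 2n^2/9$, depend only on $n/2 < p < 2n/3$ and can be checked
mechanically. A minimal sketch, with helper names that are assumptions of
this illustration:

```python
from math import comb


def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))


def primes_between(n):
    """Primes p with n/2 < p < 2n/3 (integer arithmetic avoids rounding)."""
    return [p for p in range(n // 2 + 1, n)
            if 2 * p > n and 3 * p < 2 * n and is_prime(p)]


def case_bounds_hold(n, p):
    clique_case = 8 * comb(p, 2) > n * n - n       # C(p,2) > (n^2 - n)/8
    bipartite_case = 9 * p * (n - p) > 2 * n * n   # p(n-p) > 2n^2/9
    return clique_case and bipartite_case
```

For very small n the interval can be empty of primes (n = 14 is one such
case), which is why the existence claim should be read as holding for
sufficiently large n.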



5.2 Non-Deterministic Decision Trees 

Recall that we defined $D_0(f)$ and $D_1(f)$, and showed that
$D(f) \le D_0(f) D_1(f)$.

Theorem 5.2 (Babai or Nisan?) Suppose $f$ is weakly symmetric (invariant
under a transitive group $\Gamma$). Then $D_0(f) D_1(f) \ge n$.

Proof: Recall $D_0(f) = \min\{k : f = E_1 \wedge E_2 \wedge \cdots \wedge E_N,\ E_i = x_{j_1} \vee \cdots \vee x_{j_k}\}$.
Similarly, $D_1(f) = \min\{k : f = F_1 \vee F_2 \vee \cdots \vee F_M,\ F_i = x_{j_1} \wedge \cdots \wedge x_{j_k}\}$.
Recall that there must be a variable $x_j$ that occurs in both $E_1$ and $F_1$
(otherwise, we can force the function value to be 0 and 1 at the same
time). Let $\gamma \in \Gamma$. Let $E_i^\gamma$ be $E_i$ after the action of $\gamma$. Since $f$ is
invariant under $\Gamma$, we can rewrite $f = E_1^\gamma \wedge \cdots \wedge E_N^\gamma$. Therefore $E_1^\gamma$ must
have a variable in common with $F_1$.

The crucial observation at this point is that for a transitive group $\Gamma$ of
mappings on a set $S$, the quantity $q = \#\{\gamma \in \Gamma : \gamma(x) = y\}$ is independent
of $x$ and $y$. This is because for any $x$, $y$, and $y'$ we can map
$\{\gamma \in \Gamma : \gamma(x) = y\}$ one-to-one into $\{\gamma \in \Gamma : \gamma(x) = y'\}$ by composing any
fixed map $\gamma'$ satisfying $\gamma'(y) = y'$ with the maps in the first set. This
shows independence of $y$, and a similar argument shows independence of $x$.

In fact, we can determine $q$ by the equation (for fixed $x_0$)

$$qn = \sum_y \#\{\gamma \in \Gamma : \gamma(x_0) = y\} = |\Gamma|,$$

so $q = |\Gamma|/n$.

Returning to the original argument, that $E_1^\gamma$ has a variable in common
with $F_1$ for every $\gamma$ means that every $\gamma$ maps something from $E_1$ to
something in $F_1$, i.e. there are $|\Gamma|$ $\gamma$ mapping some $x$ from $E_1$ to some $y$
from $F_1$. Since any given pair $x \in E_1$ and $y \in F_1$ (again abusing notation)
has at most $q = |\Gamma|/n$ $\gamma$'s mapping $x$ to $y$, it follows that there are at
least $|\Gamma|/q = n$ pairs $(x, y)$ with $x \in E_1$ and $y \in F_1$, i.e. $|E_1||F_1| \ge n$.

Recalling that $|E_1| \le D_0(f)$ and $|F_1| \le D_1(f)$ finishes the argument. □
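The inequality of Theorem 5.2 can be observed on small weakly symmetric
functions. The sketch below takes $D_0$ and $D_1$ to be worst-case certificate
sizes (the fewest input bits that force the value 0, respectively 1), which
is how the clause-width definitions recalled in the proof are read here;
that reading and the function names are assumptions of this illustration.

```python
from itertools import combinations, product


def certificate_size(f, n, x):
    """Fewest positions of x whose values alone force the value f(x)."""
    inputs = list(product((0, 1), repeat=n))
    for size in range(n + 1):
        for positions in combinations(range(n), size):
            agree = [y for y in inputs if all(y[i] == x[i] for i in positions)]
            if all(f(y) == f(x) for y in agree):
                return size
    return n


def d0_d1(f, n):
    """(D_0, D_1): worst certificate size over 0-inputs and over 1-inputs."""
    worst = {0: 0, 1: 0}
    for x in product((0, 1), repeat=n):
        worst[f(x)] = max(worst[f(x)], certificate_size(f, n, x))
    return worst[0], worst[1]
```

For majority on five bits this gives (3, 3), and for the four-input
function "two cyclically adjacent inputs are 1" it gives (2, 2), so the
bound $D_0 D_1 \ge n$ is met with equality in the second case.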



5.3 Randomized Decision Trees 

The general question in complexity of randomized algorithms is "Does the 
ability to flip a coin add computational power?" Over the past twenty years 
we have learned that the answer is a definitive yes. Generally, randomization 
may give an algorithm the ability to avoid a few bad computation paths, 
and thus better its worst case behavior. 

From the decision tree model, there are a number of ways to model ran- 
domized algorithms. One is that at each node, rather than definitely query- 
ing some variable, we choose which variable to query randomly according to 
a probability distribution dependent on the node and the previous results of 
random choices. For a given input the number of input bits queried is then 
a random variable, and the decision tree complexity is the maximum over 
all inputs of the expected value of the number of input bits queried. 

An alternate model is that the algorithm makes all random choices in
advance, and from that point on is deterministic. In this model an algorithm
for deciding a property is specified as a probability distribution over all
possible decision trees for the property. For a given tree $T$, if $\delta(x, T)$
denotes the number of input bits queried for a given input $x \in \{0,1\}^n$, and
$p_T$ denotes the probability of the algorithm choosing $T$, then the decision
tree complexity of the algorithm is

$$\max_x \sum_T p_T\, \delta(x, T).$$

Figure 6: A Function with Randomized Complexity o(n).



This will be made more concrete by the following example. The example 
is due to Saks, Snir, and Wigderson, and gives a tree formula which has 
randomized decision tree complexity o(n). (We have seen previously that 
all tree formulae have (deterministic) decision tree complexity n.) 

The function $f_k : \{0,1\}^n \to \{0,1\}$, $n = 2^k$, is a tree formula. (I.e. it is
defined by a formula in which each variable occurs exactly once.) We may
define $f_k$ inductively by $f_0(x_1) = x_1$ and

$$f_k(x_1, \ldots, x_n) = \begin{cases} f_{k-1}(x_1, \ldots, x_{n/2}) \wedge f_{k-1}(x_{n/2+1}, \ldots, x_n) & k \text{ odd}, \\ f_{k-1}(x_1, \ldots, x_{n/2}) \vee f_{k-1}(x_{n/2+1}, \ldots, x_n) & k \text{ even}. \end{cases}$$

That is, we take a balanced binary tree and construct a formula by labeling
the leaves with the variables and labeling internal nodes of the tree
alternately with "and" and "or" gates as we go up the tree.

We saw previously that tree formulae are evasive. The evasiveness of $f_k$
also follows by the weak symmetry of $f_k$ and the fact that the number of
variables is a prime power. Our task is to construct a randomized algorithm
for $f_k$ with expected decision tree complexity $o(n)$.

First, for convenience, we get rid of the asymmetry at different levels
by replacing the and-gates and or-gates by nand-gates (negated and-gates).
Specifically, we use

$$(f_1 \vee f_2) \wedge (f_3 \vee f_4) = \overline{\,\overline{\bar f_1 \wedge \bar f_2} \wedge \overline{\bar f_3 \wedge \bar f_4}\,}$$

to replace all gates except possibly the gate at the root with nand-gates.
If the gate at the root does not become a nand-gate, it is an and-gate, and
we simply negate it. This complements $f$, but doesn't change the complexity
of the function.

Now that we have nand-gates at all the nodes, consider the evaluation
of the function. If for some nand-gate, we know one input is 0, we know
the gate outputs 1, independent of the other input. Thus our randomized
strategy will be to start at the root, choose one of the two inputs uniformly
at random to evaluate, and recursively evaluate it. If it returns 0 we return
1, otherwise we evaluate the other input recursively, returning 1 if the other
input returns 0, and 0 otherwise.
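The strategy just described can be written down directly. The following is
a minimal sketch, assuming the input is given as a list of leaf values whose
length is a power of two; the function names and the returned query count
are illustrative choices, not part of the lecture.

```python
import random


def nand_tree_eval(bits, rng=random):
    """Evaluate a balanced NAND tree over `bits`, reading as few leaves as
    the randomized strategy allows. Returns (value, leaves_read)."""
    if len(bits) == 1:
        return bits[0], 1
    half = len(bits) // 2
    children = [bits[:half], bits[half:]]
    rng.shuffle(children)                  # choose which input to try first
    first, cost = nand_tree_eval(children[0], rng)
    if first == 0:
        return 1, cost                     # NAND(0, anything) = 1
    second, more = nand_tree_eval(children[1], rng)
    return 1 - second, cost + more         # NAND(1, s) = not s


def nand_tree_exact(bits):
    """Reference evaluation that reads every leaf."""
    if len(bits) == 1:
        return bits[0]
    half = len(bits) // 2
    return 1 - (nand_tree_exact(bits[:half]) & nand_tree_exact(bits[half:]))
```

Averaging the second component of `nand_tree_eval` over many runs on a fixed
input gives an empirical estimate of the expected number of queries on that
input.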

Let $a_k$ denote the expected number of variables checked to compute $f$ if
$f(x) = 1$, and let $b_k$ denote the expected number if $f(x) = 0$.

If $f(x) = 0$, then both inputs must be evaluated and are 1. If $f(x) = 1$,
then either both inputs are 0, in which case we definitely only evaluate one
input, or one input is 0, in which case we have at least a one in two chance
to evaluate only one input. This yields

$$b_k \le 2a_{k-1},$$

$$a_k \le \max\{\, b_{k-1},\ \tfrac12 b_{k-1} + \tfrac12 (a_{k-1} + b_{k-1}) \,\}.$$

We can write this as

$$\begin{pmatrix} a_k \\ b_k \end{pmatrix} \le \begin{pmatrix} 1/2 & 1 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} a_{k-1} \\ b_{k-1} \end{pmatrix}.$$

To estimate $a_k$ and $b_k$ from such a recurrence relation, we can examine
the eigenvalues of the matrix. The action of the matrix on an eigenvector
(by definition) is just to stretch the vector by the corresponding eigenvalue.
Thus on repeated application of the matrix, the norm of the eigenvector
with largest eigenvalue will grow most rapidly. For a vector which is not
an eigenvector, we can consider it as a linear combination of eigenvectors
and similarly show that with repeated application of the matrix, its norm
grows no faster than that of the eigenvector with largest eigenvalue. These
considerations show that if there is a single eigenvalue $\lambda$ of largest
absolute value, then after $k$ applications of the matrix the resulting vector
has norm $O(\lambda^k)$ times the norm of the original vector. Thus $a_k$ and $b_k$ are
$O(\lambda^k)$. For the above matrix, the eigenvalues are $\frac14(1 \pm \sqrt{33})$, so that
$a_k, b_k = O\big((\frac{1 + \sqrt{33}}{4})^k\big)$. For our randomized algorithm, $k = \log_2 n$, so we have

$$a_k, b_k = O\left(\left(\tfrac{1 + \sqrt{33}}{4}\right)^{\log_2 n}\right) = O\left(n^{\log_2 \frac{1 + \sqrt{33}}{4}}\right) \approx O(n^{0.754}) = o(n),$$

and we are done. (See the end of this lecture for details of the calculation.)
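The exponent can be confirmed numerically. The sketch below assumes the
recurrence is taken with equality in the form $a_k = \frac12 a_{k-1} + b_{k-1}$,
$b_k = 2a_{k-1}$ (the form consistent with the stated eigenvalues) and starts
from $a_0 = b_0 = 1$, since a single leaf costs one query; the names are
illustrative.

```python
from math import log2, sqrt


def iterate_recurrence(k):
    """Apply a <- a/2 + b, b <- 2a (simultaneously) k times from (1, 1)."""
    a, b = 1.0, 1.0
    for _ in range(k):
        a, b = 0.5 * a + b, 2.0 * a
    return a, b


LARGEST_EIGENVALUE = (1 + sqrt(33)) / 4
EXPONENT = log2(LARGEST_EIGENVALUE)   # the power of n in the bound
```

The ratio of successive values of $a_k$ settles at the largest eigenvalue,
about 1.686, and its base-2 logarithm is about 0.754.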

One might suspect that one could improve this algorithm by sampling 
the inputs randomly and using the result to bias the choice of which input 
to evaluate first towards the input with more l's in the subtree. It turns 
out this doesn't help; in fact the above algorithm is essentially optimal. 
(Although we don't give the proof.) 

In the next lecture we will begin to study the probabilistic version of the
AKR-conjecture, which is that for a graph of $n$ nodes the expected decision
tree complexity of a randomized algorithm for a non-trivial graph property
is $\Theta(n^2)$. A lower bound of order $n$ follows from
$D_R(f)^2 \ge D_0(f) D_1(f) \ge \binom{n}{2}$. Yao improved this lower bound to $n \log n$
and introduced some techniques which we will study. Valerie King improved
the bound to $\Omega(n^{5/4})$ in her thesis, and subsequently Hajnal improved the
bound to $\Omega(n^{4/3})$.

One question which arises is the complexity of probabilistic algorithms
which are allowed a small probability of error. (Such an algorithm is
usually called a Monte Carlo algorithm; a Las Vegas algorithm, by contrast,
never errs.) One can gain a little bit; for instance, consider the majority
function, which is 1 provided a majority of its inputs are 1. By random
sampling half the inputs, say, one can compute the majority function with a
small probability of error. On the other hand, there is a lower bound on the
complexity of $\sqrt{D(f)}$, which leaves a large gap.



5.3.1 Details of Calculating $a_k$ and $b_k$

Following are the calculations of $a_k$ and $b_k$ in more detail; they may be
skipped by anyone familiar with linear algebra. Let

$$M = \begin{pmatrix} 1/2 & 1 \\ 2 & 0 \end{pmatrix}.$$

The eigenvalues of $M$ are the values $\lambda$ such that there exists a
corresponding eigenvector $x \ne 0$ with $Mx = \lambda x$. Rewriting this as
$(M - \lambda I)x = 0$, we see that the eigenvalues are those values for which
$M - \lambda I$ is singular, i.e. has determinant $|M - \lambda I| = 0$:

$$|M - \lambda I| = 0 \iff \lambda^2 - \tfrac12 \lambda - 2 = 0 \iff \lambda = \tfrac14 (1 \pm \sqrt{33}).$$



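Once we have the eigenvalues $\lambda_1$ and $\lambda_2$, let $x_1$ and $x_2$ be corresponding
eigenvectors, and define the matrices

$$X = \begin{pmatrix} x_1 & x_2 \end{pmatrix}, \qquad D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.$$

It is easy to verify that $MX = XD$, so $M = XDX^{-1}$, $M^2 = XD^2X^{-1}$,
and $M^k = XD^kX^{-1}$. Since

$$D^k = \begin{pmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{pmatrix},$$

we have

$$\begin{pmatrix} a_k \\ b_k \end{pmatrix} \le M^k \begin{pmatrix} a_0 \\ b_0 \end{pmatrix} = X \begin{pmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{pmatrix} X^{-1} \begin{pmatrix} a_0 \\ b_0 \end{pmatrix}.$$

After one determines the eigenvectors $x_1$ and $x_2$, it is an easy matter to
complete this and give a closed form for $a_k$ and $b_k$.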
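One way to complete the calculation is sketched below. It assumes
$M = \begin{pmatrix} 1/2 & 1 \\ 2 & 0 \end{pmatrix}$, equality in the recurrence, and $a_0 = b_0 = 1$; under
those assumptions $(\lambda, 2)$ is an eigenvector for each eigenvalue $\lambda$, and the
coefficients `c1`, `c2` are chosen to match the initial vector. The names
are illustrative.

```python
from math import isclose, sqrt

LAM1 = (1 + sqrt(33)) / 4
LAM2 = (1 - sqrt(33)) / 4


def closed_form(k):
    """(a_k, b_k) = c1*LAM1^k*(LAM1, 2) + c2*LAM2^k*(LAM2, 2)."""
    c1 = (1 - LAM2 / 2) / (LAM1 - LAM2)   # solves c1*LAM1 + c2*LAM2 = 1
    c2 = 0.5 - c1                         # and 2*(c1 + c2) = 1
    a = c1 * LAM1 ** (k + 1) + c2 * LAM2 ** (k + 1)
    b = 2 * (c1 * LAM1 ** k + c2 * LAM2 ** k)
    return a, b


def by_iteration(k):
    """The same quantities obtained by applying M repeatedly to (1, 1)."""
    a, b = 1.0, 1.0
    for _ in range(k):
        a, b = 0.5 * a + b, 2.0 * a
    return a, b
```

Because $|\lambda_2| < \lambda_1$, the second term becomes negligible and both
sequences grow like $\lambda_1^k$.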



(Notes by Sigal Ar and Neal Young.)

6 Lower Bounds on Randomized Decision Trees

We begin this lecture with a proof of a basic result, Farkas' lemma. The 
lemma gives a necessary and sufficient condition for the existence of a so- 
lution to a set of linear inequalities. We then discuss (and prove) von Neu- 
mann's min-max theorem. The theorem gives some insight into the advan- 
tages of randomization. We then present some techniques developed by Yao 
applying the min-max theorem to give lower bounds on randomized decision 
tree complexity. 

6.1 Farkas' Lemma 

For a system of linear equalities, everyone knows necessary and sufficient
conditions for solvability. Farkas' lemma is the analogue for systems of linear
inequalities. Consider the problem "Does a system of linear inequalities

$$\sum_{j=1}^{m} a_{ij} x_j \le b_i \qquad (i = 1, \ldots, n) \qquad (1)$$

have a solution?" Roughly, we expect this problem to be in NP, since we
can exhibit an easy proof (an assignment of $x$) if it is solvable. Farkas'
lemma implies that it is also in co-NP, that is, if it is not solvable there
is also an easy proof that it isn't.

If the system is not solvable, we will prove it by exhibiting $\lambda_i$ satisfying
$\sum_i \lambda_i a_{ij} = 0$ ($j = 1, \ldots, m$), $\sum_i \lambda_i b_i < 0$, and $\lambda_i \ge 0$ ($i = 1, \ldots, n$).
This provides a proof that no $x$ satisfies (1), because if such an $x$
existed we would have

$$0 = \sum_j x_j \sum_i \lambda_i a_{ij} = \sum_i \lambda_i \sum_j a_{ij} x_j \le \sum_i \lambda_i b_i < 0.$$
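Checking such a certificate is purely mechanical, which is the sense in
which unsolvability has an easy proof. A minimal sketch, with the system
given as a list of rows `A` and right-hand sides `b` (these names and the
example system are assumptions of this illustration):

```python
def is_farkas_certificate(A, b, lam):
    """True if lam >= 0, lam^T A = 0 and lam^T b < 0 for the system A x <= b."""
    n, m = len(A), len(A[0])
    nonnegative = all(l >= 0 for l in lam)
    cancels_x = all(sum(lam[i] * A[i][j] for i in range(n)) == 0
                    for j in range(m))
    negative = sum(lam[i] * b[i] for i in range(n)) < 0
    return nonnegative and cancels_x and negative


def satisfies(A, b, x):
    """True if x satisfies every inequality of A x <= b."""
    return all(sum(a * xj for a, xj in zip(row, x)) <= bi
               for row, bi in zip(A, b))


# Example: x1 + x2 <= 1, x1 >= 1, x2 >= 1 has no solution.
A = [[1, 1], [-1, 0], [0, -1]]
b = [1, -1, -1]
```

Here `lam = [1, 1, 1]` is a certificate: the three inequalities add up to
$0 \le -1$.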

Lemma 6.1 (Farkas) For any $a_{ij}$ and $b_i$ ($j = 1, \ldots, m$, $i = 1, \ldots, n$),

$$\sum_j a_{ij} x_j \le b_i \qquad (i = 1, \ldots, n) \qquad (2)$$

has a solution $x$ if and only if

$$\sum_i \lambda_i a_{ij} = 0 \quad (j = 1, \ldots, m), \qquad \sum_i \lambda_i b_i < 0, \qquad \lambda_i \ge 0 \quad (i = 1, \ldots, n) \qquad (3)$$

has no solution $\lambda$.

Proof: Consider the vectors $(a_{i,1}, \ldots, a_{i,m}, b_i)$ ($i = 1, \ldots, n$). Consider the
cone of these vectors. (3) says exactly that $y^* = (0, \ldots, 0, -1)$ is in this
cone. Suppose that (3) has no solution, so $y^*$ is not in the cone. Then
there exists a separating hyperplane $H = \{y : \sum_j y_j h_j = 0\}$ such that $y^*$
is on one side of $H$ and the cone is on the other, i.e. $\sum_j y^*_j h_j < 0$ and
$a_{i1} h_1 + \cdots + a_{im} h_m + b_i h_{m+1} \ge 0$ ($i = 1, \ldots, n$). The first condition says
that $h_{m+1} > 0$, and the second says that $\sum_j a_{ij} h_j / h_{m+1} \ge -b_i$ ($i = 1, \ldots, n$).
Thus letting $x_j = -h_j / h_{m+1}$, we have a solution to (2). □



6.2 Von Neumann's Min-Max Theorem 

Recall our second characterization of a randomized algorithm via decision
trees. For a function $f : \{0,1\}^n \to \{0,1\}$, on an input $x$, an algorithm
$A$ chooses a deterministic algorithm for $f$ with decision tree $T$ according
to some probability distribution $p$ (independent of $x$!), and then runs the
deterministic algorithm on $x$.

On a given input $x$, having chosen a particular tree $T$, $A$
(deterministically) takes some complexity $\delta(T, x)$ to compute $f(x)$. The
expected complexity for $A$ on $x$ is given by $\sum_T p_T\, \delta(T, x)$. We are interested
in the worst case expected complexity for $A$ (i.e. the adversary chooses $x$
to maximize the complexity): $\max_x \sum_T p_T\, \delta(T, x)$. Finally, if $A$ is optimal, it
minimizes this worst-case complexity, and thus takes complexity

$$D_R(f) = \min_p \max_x \sum_T p_T\, \delta(T, x).$$

We can view this process as a game, in which we choose $p$ (determining
$A$), and then the adversary, knowing our choice, chooses $x$. To play the
game, we run $A$ on $x$. Our goal is to minimize the expected complexity; the
adversary's goal is to maximize it.

For this situation (called a zero-sum game, since we lose exactly what
our opponent gains), there is a general theorem, von Neumann's min-max
theorem. We have a game, defined by a matrix $M$, for two players: the
row player and the column player. The game is played as follows. The
column player chooses a column $c$ and the row player chooses a row $r$. The
column player then pays $M_{rc}$ to the row player.

                      Column Player
                      c=1     c=2
    Row       r=1      1       0
    Player    r=2      0       1

Figure 7: A Simple Zero-Sum Game

The column player wants to minimize $M_{rc}$, and the row player wishes to
maximize it.

To make this concrete, assume we are playing a game (see Figure 7)
where each player chooses either 1 or 2. If we choose the same as the
adversary, we pay her 1. Otherwise we pay nothing. What should our
strategy be? Suppose our strategy is to choose 1. After playing the game
several times, with the adversary beating us every time, we begin to suspect
that the adversary knows our strategy and has used that knowledge to beat
us. What can we do, given that the adversary may know our strategy and
use that information to try to beat us?

Von Neumann's key observation is that we can use randomization to
negate the adversary's advantage in knowing our strategy. For our simple
game, if our strategy is to choose 1 or 2 randomly, each with probability
1/2, then no matter what strategy the adversary picks, we have an expected
loss of at most 1/2.
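The claim about the simple game can be verified by direct computation. In
the sketch below we are the column player, a mixed strategy is a pair of
probabilities over the two columns, and the adversary replies with her best
row; the names are illustrative and exact fractions are used to avoid
rounding.

```python
from fractions import Fraction

# M[r][c]: amount the column player pays when row r and column c are chosen.
M = [[1, 0],
     [0, 1]]

UNIFORM = [Fraction(1, 2), Fraction(1, 2)]


def worst_case_loss(p):
    """Largest expected payment against column distribution p, taken over
    the adversary's choice of row."""
    return max(sum(M[r][c] * p[c] for c in range(2)) for r in range(2))
```

Always choosing column 1 loses 1 against an informed adversary, while the
uniform strategy loses only 1/2, and no other mixture does better.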

More generally, in any zero-sum game we have a randomized (also called 
a mixed) strategy which negates the adversary's advantage in knowing our 
strategy. To make precise the notion of "negating the advantage," we con- 
sider turning the tables, so that she chooses her strategy first, and then we 
choose ours. Then provided we choose the randomized strategy that negates 
her advantage in the first situation, and she chooses the randomized strategy 
which negates our advantage in the second situation, we will expect to do 
just as well in the first situation as the second. Formally, 



Theorem 6.2 (Von Neumann's min-max theorem) For any zero-sum game $M$,

$$\min_p \max_q q^T M p = \max_q \min_p q^T M p.$$

(Note that $p$ and $q$ range over all probability distributions of columns and
rows, respectively.)

We start with some observations. In the min-max theorem, in the inner
max and min, randomization is not important. That is, (if $e_j$ denotes the
$j$th unit vector in a vector space implicit in the context)

$$\forall p, \quad \max_q q^T M p = \max_j e_j^T M p,$$

and similarly for the inner term on the right hand side. This is because, for
a fixed $p$, $q^T M p$ is a linear function of $q$, and thus is maximized at one of
the vertices $e_j$ of the probability space.
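Proof: Obviously,

$$\forall q_0, p_0, \quad \max_q q^T M p_0 \ge q_0^T M p_0 \ge \min_p q_0^T M p.$$

Thus

$$\min_{p_0} \max_q q^T M p_0 \ge \max_{q_0} \min_p q_0^T M p.$$

The other direction is not so easy. We suppose that for some $t$,

$$\forall p \ (p \ge 0,\ \textstyle\sum_j p_j = 1): \quad \max_j e_j^T M p \ge t,$$

and we want to show that

$$\exists q : \quad q \ge 0, \quad \textstyle\sum_j q_j = 1, \quad \forall i,\ q^T M e_i \ge t.$$

This will follow almost immediately from Farkas' lemma.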
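Suppose that $\nexists q : q \ge 0, \sum_j q_j = 1, \forall i,\ q^T M e_i \ge t$. Farkas' lemma
implies that there exist $\lambda = (\lambda_1, \ldots, \lambda_n)$, $\mu = (\mu_1, \ldots, \mu_m)$, and $\alpha$ such that:

$$M\lambda + \mu + \alpha \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad (4)$$

$$t \sum_i \lambda_i + \alpha > 0, \qquad (5)$$

$$\lambda_i, \mu_j \ge 0.$$

Note that $\alpha$ is not constrained to be non-negative and that in (5) the
expression is constrained to be positive, rather than negative. These are
essentially trivial variations from the standard form of Farkas' lemma. The
first is because the inequality corresponding to $\alpha$ is in fact an equality,
and the second is because the unsatisfiable constraints are of the form
$\cdots \ge t$, rather than $\cdots \le t$.

Here $\sum_i \lambda_i > 0$, since $\lambda = 0$ would force $\mu_j = -\alpha \ge 0$ in (4), contradicting
(5). Letting $p = \lambda / \sum_i \lambda_i$, (4) and (5) imply

$$Mp \le -\frac{\alpha}{\sum_i \lambda_i} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} < t \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix},$$

so that $\max_j e_j^T M p < t$, which contradicts our assumption.

□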



6.3 Lower Bounds 

The discussion before the proof, in our original context, means that once we 
choose p, determining A, the adversary can choose a specific input x, rather 
than a distribution of inputs, to give the worst case expected behavior for A. 
Alternatively, if the adversary chooses an input distribution first, then we 
can do our best by subsequently choosing the best deterministic algorithm, 
rather than a randomized algorithm, for this input distribution. Thus the 
min-max theorem implies that if the adversary can choose a distribution 
for which she can force any deterministic algorithm to have an expected 
complexity of at least t, then for every randomized algorithm there is an 
input such that the expected complexity of the algorithm on that input is 
at least t. 

To show a lower bound on the decision tree complexity of any randomized 
algorithm, then, it suffices to show the same bound on the complexity of 
all deterministic algorithms for some fixed input distribution. This is the 
technique we will use. 

As a starting point, recall that $D_R(f)$ denotes the minimum expected
decision tree complexity of a randomized algorithm for $f$. An almost trivial
observation is

$$D_R(f) \ge \max\{D_0(f), D_1(f)\}. \qquad (6)$$

This follows because when $f(x) = i$, any algorithm must look at at least
$D_i(f)$ bits of $x$ before it can be sure $f(x) = i$. Last lecture we showed that
for (non-trivial) weakly symmetric $f$, $D_0(f) D_1(f) \ge n$. Thus $D_R(f) \ge \sqrt{n}$.
Next we give a lemma which uses the min-max theorem to give a better
lower bound for certain types of functions.

Lemma 6.3 (Yao) Let $f : \{0,1\}^n \to \{0,1\}$. Suppose we can partition
$\{1, \ldots, n\}$ into $S_1, \ldots, S_r$ so that $|S_i| \ge t$ and

$$f(x) = \begin{cases} 0 & \text{if } \forall i\ \ |\{j : x_j = 0\} \cap S_i| \ge t, \\ 1 & \text{if } \exists i\ \ |\{j : x_j = 0\} \cap S_i| = 0. \end{cases}$$

Then $D_R(f) \ge \Omega(n/t)$.

(Note that the condition on $f$ leaves some values of $f$ unspecified. Also note
that this bound doesn't follow immediately from (6).)

Proof: We give an input distribution on $x$ and give a lower bound on the
expected complexity of any deterministic algorithm on this input
distribution.

To generate $x$, choose $t$ indices $j$ uniformly at random from each $S_i$ and
set $x_j = 0$. Set the remaining $x_j = 1$. Then $f(x) = 0$, and for an algorithm
to verify this it must query an $x_j$ with the value 0 from each $S_i$. How many
queries must the algorithm expect to make before finding such an $x_j$ in a
given $S_i$? Since the $t$ $x_j$ with value 0 were chosen randomly, we can view the
algorithm as sampling randomly (without replacement) from a set of size
$k = |S_i|$ until it finds one of the $t$ $x_j$ chosen to be 0. The expected time
for this is $\Omega(k/t)$. By linearity of expectation, the expected time to find
a 0 in each $S_i$ is thus at least $\sum_i \Omega(|S_i|/t) = \Omega(n/t)$. □
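The sampling estimate in this proof can be made exact: drawing without
replacement from k items of which t are marked, the expected number of draws
up to and including the first marked item is $(k+1)/(t+1)$, which is
$\Omega(k/t)$. The sketch below computes the expectation from first principles
with exact fractions; the function name is an assumption of this
illustration.

```python
from fractions import Fraction
from math import comb


def expected_queries(k, t):
    """Expected draws, without replacement from k items with t marked,
    until the first marked item appears (1 <= t <= k)."""
    total = Fraction(0)
    for draws in range(1, k - t + 2):
        # the first draws-1 items are unmarked ...
        miss = Fraction(comb(k - t, draws - 1), comb(k, draws - 1))
        # ... and the next one is marked
        hit = Fraction(t, k - (draws - 1))
        total += draws * miss * hit
    return total
```

Summing $(|S_i| + 1)/(t + 1)$ over the sets of the partition gives at least
$n/(t+1)$, in line with the $\Omega(n/t)$ bound of the lemma.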

We will apply this lemma to lower bound the randomized complexity
of monotone bipartite graph properties on $2n$ nodes ($n$ in each part). The
complexity of such properties is not quite solved. The trivial bound
$D_R(f) \ge \sqrt{\#\text{inputs}}$ gives a lower bound of $n$. This was first improved to
$n \log n$ by Yao, who introduced the general techniques for the problem.
Valerie King improved the bound to $n^{5/4}$, and subsequently Hajnal improved
the bound to $n^{4/3}$.

6.3.1 Graph Packing 

Our problem is closely related to the problem of graph packing. Two graphs
$G_1$ and $G_2$ on $n$ nodes can be packed if (possibly after relabeling the
vertices) the edge sets of the graphs don't overlap. If $G_1$ and $G_2$ pack,
then (after the relabeling) $G_1 \subseteq \overline{G_2}$.

Suppose $G_1$ is a minimal $G$ in a graph property $\mathcal{P}_f = \{G : f(G) = 1\}$,
and $G_2$ is a minimal $G$ such that $\overline{G} \notin \mathcal{P}_f$. Observe that $G_1$ and $G_2$ don't
pack. Furthermore, if any two $G_1$ and $G_2$ don't pack, we have
$|E(G_1)|\,|E(G_2)| \ge \binom{n}{2}$ by essentially the same argument that showed
$D_0(f) D_1(f) \ge n$. More interestingly, one can show

$$d_{\max}(G_1)\, d_{\max}(G_2) \ge n/2.$$

An instance of this is that if a graph $G$ has $d_{\min}(G) \ge n/2$, then there
exists a hamiltonian cycle $C$. That is, if $d_{\max}(\overline{G}) < n/2$, then $\overline{G}$ and any
cycle $C$ on $n$ nodes pack.
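For very small graphs, packing can be tested by trying every relabeling.
The following brute-force sketch (factorial time, so only for a handful of
nodes) represents a graph as a list of node pairs on nodes 0..n-1; the
function names are assumptions of this illustration.

```python
from itertools import permutations
from math import comb


def pack(n, edges1, edges2):
    """True if some relabeling of graph 1 makes its edge set disjoint from
    the edge set of graph 2."""
    e2 = {frozenset(e) for e in edges2}
    for perm in permutations(range(n)):
        moved = {frozenset((perm[u], perm[v])) for u, v in edges1}
        if not (moved & e2):
            return True
    return False


def edge_product_below_threshold(n, edges1, edges2):
    """True if |E(G1)| * |E(G2)| < C(n, 2), the regime in which the edge
    count bound above says the two graphs must pack."""
    return len(edges1) * len(edges2) < comb(n, 2)
```

On four nodes, two copies of the star with three edges do not pack (edge
product 9, maximum degree product 9), while a two-edge matching and a
two-edge path do (edge product 4, below $\binom{4}{2} = 6$).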



6.3.2 Yao's $d_{\max}/\bar d$ Lemma

Next we will in some sense specialize Lemma 6.3 to the case of a monotone
bipartite graph property. We will choose a minimal graph $G$ for the property,
and show how to construct a containing graph $G'$ such that we can partition
a large subset of the edges of $G'$ into disjoint sets such that if we remove
edges from the subsets, the property holds if we leave any set untouched,
but fails if we remove at least a small number from each set. Lemma 6.3
will apply to show essentially that any randomized algorithm has to check
many edges in each set.

Lemma 6.4 (Yao) Let $f$ be a (non-trivial) monotone bipartite graph
property on bipartite graphs with $2n$ nodes, the two parts $U$ and $W$ each
having $n$ nodes, and let $\mathcal{P} = \{G : f(G) = 1\}$.

If $G$ is a graph in $\mathcal{P}$ with minimum $d_{\max}(G)$ (i.e. minimum maximum
degree over vertices in $U$) and $\bar d(G)$ is the average degree of vertices of $G$
in $U$, then

$$D_R(f) \ge \Omega(1)\, \frac{n\, d_{\max}(G)}{\bar d(G)}.$$

Proof: Of the G G V with minimum d max , choose one with the fewest 
number of maximum degree vertices, so that if any graph has fewer vertices 
of degree d max (and no higher degree vertices) it is not in V . 

Assume d max > Ad, otherwise the bound is trivial. Assume also that the 
vertices are labeled so that vertex is a maximal degree vertex in U, and 
vertices 1, ...,n/2 are in U and have degree at most h = 2d. 

We form the containing graph G' by adding edges from vertex i to the 
neighbors of vertex and vertex i + G' has two essential properties. First, 
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if we delete any 2h + 1 edges out of each vertex 0, 1, n/2, we destroy the 
property V, because we reduce the number of vertices in U with degree d max 
by 1. 

Second, if for some $i$ (possibly 0), for each vertex
$j = 0, 1, \ldots, i - 1, i + 1, \ldots, n/2$, we delete the edges from $j$
into $\Gamma_0 - \Gamma_j - \Gamma_{j+1}$, we preserve $\mathcal{P}$. (Here
$\Gamma_j$ denotes the set of neighbors of vertex $j$.)

This is because we can permute the vertices of $U$ so that the permuted graph
contains the original graph $G$. We do this as follows: shift vertex 0 to
vertex 1, vertex 1 to vertex 2, and so on, and vertex $i$ to vertex 0.

To apply lemma 6.3, we define $S_i = \{i\} \times (\Gamma_0 - \Gamma_i - \Gamma_{i+1})$
(for $i = 1, \ldots, n/2$), $S_0 = \{0\} \times (\Gamma_0 - \Gamma_1)$, and
$S = \bigcup_i S_i$.

Then if we obtain $f'$ by restricting the domain of $f$ to graphs which agree
with $G'$ on edges not in $S$, $f'$ is a function of
$|S| \ge \frac{n}{2}(d_{\max} - 2h)$ variables. If we start with $G'$ and
within each set $S_i$ delete $2h + 1$ edges, $f'$ becomes 0, but if we start
with $G'$ and delete edges within $S$ leaving at least one set $S_i$ complete,
the function stays 1. Thus lemma 6.3 implies the claimed bound

$$D_R(f) \ge D_R(f') \ge \Omega(1)\, n\, \frac{d_{\max}}{d}.$$

□

7 Randomized Decision Tree Complexity, continued

In this lecture we continue giving lower bounds on randomized decision tree
complexity, combining the various bounds we have developed to show a
lower bound of $\Omega(n^{5/4})$ on the randomized decision tree complexity
of any non-trivial, monotone, bipartite graph property.

The general method is to choose both a minimal graph with the property 
and a graph whose complement is minimal in the complementary property, 
and then to use the fact that the two graphs don't pack to get constraints 
on the maximum and average degrees of the two graphs, and finally to apply 
Yao's lemma from last lecture to bound the randomized complexity via the 
constraints on the degrees. 

7.1 More Graph Packing 

Recall that graphs $G_1$ and $G_2$ pack if after some relabeling of the nodes
of $G_1$ the two graphs have no common edges. That is,
$\exists G_1' : G_1' \cong G_1,\ G_1' \subseteq \overline{G_2}$.

Recall that $d^U_{\max}(G)$, for a bipartite graph $G = (U, W, E)$, denotes the
maximum degree of a vertex in $U$ in $G$ (and $d^W_{\max}(G)$ the maximum
degree of a vertex in $W$), and $d(G)$ denotes $|E|/n$, the average degree of
a vertex in $G$. For this lecture, we will restrict our attention to
bipartite graphs $G_1 = (U, W, E_1)$ and $G_2 = (U, W, E_2)$ with equal size
parts, i.e. $|U| = |W| = n$, so that the average degree in each part is the
same.

We will show the following lemma. 

Lemma 7.1 (Bollobás-Eldridge) If

$$d^U_{\max}(G_1)\, d^W_{\max}(G_2) + d^U_{\max}(G_2)\, d^W_{\max}(G_1) \le n$$

then $G_1$ and $G_2$ pack.
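For very small graphs the lemma can be checked by exhaustive search. The following is a minimal sketch, not part of the original notes: bipartite graphs are assumed to be given as lists of `(u, w)` pairs with both parts labeled `0..n-1`, and the function names are hypothetical.

```python
from itertools import permutations


def max_degrees(edges, n):
    """Return (max degree in U, max degree in W) for edges given as (u, w)."""
    deg_u = [0] * n
    deg_w = [0] * n
    for u, w in edges:
        deg_u[u] += 1
        deg_w[w] += 1
    return max(deg_u), max(deg_w)


def lemma_7_1_condition(edges1, edges2, n):
    """Check the sufficient condition of lemma 7.1."""
    u1, w1 = max_degrees(edges1, n)
    u2, w2 = max_degrees(edges2, n)
    return u1 * w2 + u2 * w1 <= n


def bipartite_pack(edges1, edges2, n):
    """Brute force: try every relabeling of U and of W for the first graph."""
    forbidden = set(edges2)
    for su in permutations(range(n)):
        for sw in permutations(range(n)):
            if all((su[u], sw[w]) not in forbidden for u, w in edges1):
                return True
    return False


n = 4
matching = [(i, i) for i in range(n)]
cycle = [(i, i) for i in range(n)] + [(i, (i + 1) % n) for i in range(n)]
complete = [(u, w) for u in range(n) for w in range(n)]
```

Here `matching` and `cycle` satisfy the condition with equality ($1 \cdot 2 + 2 \cdot 1 = 4 = n$) and do pack, while `complete` and `matching` violate it and cannot pack. The search is exponential in $n$, so it is only a sanity check for tiny cases.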

There is also a non-bipartite version of this lemma:

Lemma 7.2 For two graphs $G_1$ and $G_2$, if

$$d_{\max}(G_1)\, d_{\max}(G_2) < n/2$$

then $G_1$ and $G_2$ pack.

The proof of the second lemma, which we omit, is similar to that of
the first lemma, which we give. The $n/2$ term cannot be improved in general;
the conjecture is that $(d_{\max}(G_1) + 1)(d_{\max}(G_2) + 1) \le n$ also
guarantees packing.
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Figure 8: Swapping the Labels of u and u': Two Cases. 



Proof: (Lemma 7.1.) Suppose $G_1$ and $G_2$ don't pack, and we have
relabeled the vertices of $G_1$ so as to minimize the number of overlapping
edges. There is some overlapping edge $(u, w)$. Consider swapping the labels
of $u$ and a vertex $u' \in U$ in $G_1$. With the current labeling, there is
at least one overlapping edge out of $u$ and $u'$, so after the swap this
must also be the case.

This entails that either an edge $(u, w') \in E_1$ will be mapped onto an
edge $(u', w') \in E_2$ by the swap, or that an edge $(u', w') \in E_1$ will
be mapped onto an edge $(u, w') \in E_2$ by the swap. This means that every
$u' \ne u$ is reachable from $u$ either by following an edge in $G_1$ and
then an edge in $G_2$ or by following an edge in $G_2$ and then an edge in
$G_1$.

There are at most $d^U_{\max}(G_1)\, d^W_{\max}(G_2) - 1$ paths of the first
kind (one of the candidates leads back to $u$), and similarly at most
$d^U_{\max}(G_2)\, d^W_{\max}(G_1) - 1$ of the second kind. Since these paths
must reach all $n - 1$ vertices $u' \ne u$,

$$d^U_{\max}(G_1)\, d^W_{\max}(G_2) + d^U_{\max}(G_2)\, d^W_{\max}(G_1) \ge n + 1.$$



□ 

Graph packing captures many graph theoretic notions. For instance, if $G_1$
is a $K_d$ (a graph with a $d$-clique and $n - d$ isolated nodes), and $G_2$
consists of $d$ $n/d$-cliques, then $G_1$ and $G_2$ pack, and this is
equivalent to saying the vertices of $G_1$ are $d$-colorable [21] with each
color coloring $n/d$ nodes. In fact it can be proved [22] that if $G_1$ is of
maximal degree $d - 1$, then $G_1$ and $G_2$ pack.

On the other hand, if $G_1$ instead consisted of a $(d + 1)$-clique and
$n - d - 1$ isolated nodes, then $G_1$ and $G_2$ would not pack, since $G_1$
would not be $d$-colorable. Since the product of the maximum degrees for this
pair of graphs is $d(n/d - 1) = n - d$, which for $d = n/2$ is $n/2$, this
gives a tight lower bound for lemma 7.2 and the conjecture.

21 The vertices of a graph are d-colorable if one can color them with d colors so no edge
touches two vertices of the same color. The edges of a graph are d-colorable if one can
color them with d colors so no vertex touches two edges of the same color.

22 Hajnal and Szemerédi?
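The tight example can be verified directly for the smallest case $n = 4$, $d = 2$. The sketch below is an illustration only: graphs are assumed to be lists of two-element `frozenset` edges on vertices `0..n-1`, and all helper names are hypothetical.

```python
from itertools import combinations, permutations


def clique_plus_isolated(k):
    """Edges of a k-clique on vertices 0..k-1 (other vertices stay isolated)."""
    return [frozenset(e) for e in combinations(range(k), 2)]


def disjoint_cliques(n, d):
    """Edges of d vertex-disjoint cliques, each on n // d vertices."""
    size = n // d
    edges = []
    for block in range(d):
        members = range(block * size, (block + 1) * size)
        edges.extend(frozenset(e) for e in combinations(members, 2))
    return edges


def max_degree(edges, n):
    """Maximum vertex degree of a graph on vertices 0..n-1."""
    deg = [0] * n
    for e in edges:
        for v in e:
            deg[v] += 1
    return max(deg)


def graphs_pack(edges1, edges2, n):
    """Brute force over all relabelings of the first graph."""
    forbidden = set(edges2)
    for perm in permutations(range(n)):
        if all(frozenset(perm[v] for v in e) not in forbidden for e in edges1):
            return True
    return False


n, d = 4, 2
g2 = disjoint_cliques(n, d)          # two disjoint edges
small = clique_plus_isolated(d)      # K_d: a single edge
large = clique_plus_isolated(d + 1)  # a triangle
```

With these inputs `small` packs with `g2` but `large` does not, and the product of the maximum degrees of `large` and `g2` is exactly $n/2$, matching the boundary case described above.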



7.2 Application of Packing Lemma 

So we have this packing lemma, which is fairly straightforward; how can we 
use it? 

Given a monotone bipartite graph property $\mathcal{P}_f$ on bipartite graphs
$G = (U, W, E)$ with $|U| = |W| = n$, choose $G_1 \in \mathcal{P}_f$ to have
the lexicographically smallest degree sequence [23] of vertices in $U$, so
that $G_1$ is minimal, no graph in $\mathcal{P}_f$ has lesser $d_{\max}$, and
of those with equal $d_{\max}$, none has fewer vertices of this degree.
Similarly, choose $G_2$ lexicographically smallest with
$\overline{G_2} \notin \mathcal{P}_f$.

We know $G_1$ and $G_2$ don't pack: otherwise, after relabeling,
$\overline{G_2} \supseteq G_1 \in \mathcal{P}_f$, and by monotonicity
$\overline{G_2}$ would be in $\mathcal{P}_f$.
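The selection rule can be made concrete with a short sketch. It assumes the candidate graphs with the property are available as an explicit list of `(u, w)` edge lists, which is only realistic for toy examples; the function names are hypothetical.

```python
def degree_sequence_u(edges, n):
    """Degrees of the vertices in U, listed from largest to smallest."""
    deg = [0] * n
    for u, _w in edges:
        deg[u] += 1
    return sorted(deg, reverse=True)


def lexicographically_smallest(graphs, n):
    """Pick the graph whose U-degree sequence is lexicographically smallest."""
    return min(graphs, key=lambda edges: degree_sequence_u(edges, n))


# Three 3-edge graphs on parts of size 3, e.g. candidates for the
# property "has at least three edges".
star = [(0, 0), (0, 1), (0, 2)]      # degree sequence [3, 0, 0]
mixed = [(0, 0), (0, 1), (1, 2)]     # degree sequence [2, 1, 0]
matching = [(0, 0), (1, 1), (2, 2)]  # degree sequence [1, 1, 1]
```

Python compares lists lexicographically, so `min` first minimizes the maximum degree and then breaks ties on the later entries; among these three candidates it selects `matching`.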

We have several bounds on $D_R(f)$:

$$D_R(f) \ge d(G_1)\, n, \qquad (7)$$

$$D_R(f) \ge c\, \frac{d^U_{\max}(G_1)}{d(G_1)}\, n. \qquad (8)$$

Bound (7) says that $D_R$ is at least the number of edges in $G_1$, which is
trivial since $G_1$ is minimal. Bound (8) is Yao's lemma, shown in the
previous lecture. It also holds if $G_2$ replaces $G_1$ and/or $W$ replaces
$U$, a fact we shall use.

Bound (7) implies that $d(G_1) \le D_R/n$, and thus (8) implies

$$d^U_{\max}(G_1) \le \frac{D_R}{cn}\, d(G_1) \le \frac{1}{c}\left(\frac{D_R}{n}\right)^2,$$

and similarly with $W$ and $G_2$ possibly replacing $U$ and $G_1$,
respectively. Since $G_1$ and $G_2$ don't pack, lemma 7.1 implies that

$$\frac{2}{c^2}\left(\frac{D_R}{n}\right)^4 \ge d^U_{\max}(G_1)\, d^W_{\max}(G_2) + d^U_{\max}(G_2)\, d^W_{\max}(G_1) > n,$$

which in turn implies that
$D_R > \left(\frac{c^2 n^5}{2}\right)^{1/4} = \Omega(n^{5/4})$.

Thus any non-trivial monotone bipartite graph property has
$D_R = \Omega(n^{5/4})$.



23 The degree sequence of a graph is the list of degrees of the vertices of the graph, from 
largest to smallest. 
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7.3 An Improved Packing Lemma 

To improve this result, we need an improved packing lemma:

Lemma 7.3 Let $G_1$, $G_2$ be bipartite graphs. If

$$d^U_{\max}(G_1)\, d(G_2) \le \frac{n}{100}, \qquad (9)$$

$$d^U_{\max}(G_2)\, d(G_1) \le \frac{n}{100}, \qquad (10)$$

and

$$d_{\max}(G_1),\ d_{\max}(G_2) \le \frac{n}{1000 \log n}, \qquad (11)$$

then $G_1$ and $G_2$ pack.

(The condition (11) is a technical condition, needed for the proof but not
truly a restriction. In particular, if (11) is violated, we will see that
Yao's lemma gives an immediate lower bound of
$\Omega(n^{3/2}/\sqrt{\log n})$ on $D_R$.)

Proof: This proof is somewhat more complicated, so we only sketch it. In
particular, we omit some final computations.

We have $G_1 = (U_1, W_1, E_1)$ and $G_2 = (U_2, W_2, E_2)$. We will assume
the above conditions, and show that if we fix a random relabeling $f$ of
$W_1$, with non-zero probability there is a relabeling $g$ of $U_1$ so that
$G_1$ relabeled by $f$ and $g$ shares no edges with $G_2$.

In spirit the idea is initially similar to the previous packing lemma. There
we showed that from the standpoint of a given vertex $u$, if after ruling out
neighbors of neighbors of $u$ there was a vertex left, we could swap the
labels of $u$ and that vertex and possibly reduce the number of edge
overlaps. Thus we showed roughly that, when the graphs do not pack, the
product of the maximum degree in $U$ and maximum degree in $W$ is at least
$n$.

Here, since the vertices of $W_1$ have been randomly mapped onto the
vertices of $W_2$, the neighbors of $u$ in $G_1$ are mapped onto an
essentially random set $W_2'$ of size at most $d^U_{\max}(G_1)$ in $W_2$.
Since the set in $W_2$ is essentially random, we will be able to show (using
the technical condition) that the size of its neighbor set exceeds

$$c\,|W_2'|\, d(G_2) = c\, d^U_{\max}(G_1)\, d(G_2) = cn/100$$

only with small probability. Thus with probability at least 1/2, for each $u$
there will be fewer than $n/2$ vertices $u'$ ruled out as possible images
under $g$. We will also show that the existence of $g$ is equivalent to the
existence of a perfect matching connecting each $u$ with a possible image
$u'$, and thus the existence of $n/2$ possible images of each $u$ is
sufficient to guarantee the existence of $g$.

So fix a relabeling $f$ of $W_1$ uniformly at random. When will there be a
$g$ relabeling $U_1$ so that no edges are shared? The constraint is that if
an edge $(u, v)$ is in $E_1$, then the edge $(g(u), f(v))$ is not in $E_2$.
Thus for each $u \in U_1$ we must find a $u' \in U_1$ such that
$N_{G_2}(u') \cap f(N_{G_1}(u))$ is empty. ($N_G(v)$ denotes the neighboring
vertices of $v$ in $G$.) The only additional constraint is that distinct $u$
receive distinct $u'$.

In other words, if we define a bipartite graph $H = (U_1, U_1, F)$, where

$$F = \{(u, u') : N_{G_2}(u') \cap f(N_{G_1}(u)) = \emptyset\},$$

then $g$ exists iff $H$ has a perfect matching.

The Frobenius-König-Hall theorem states that a bipartite graph
$G = (U, W, E)$ has a perfect matching iff for every vertex set
$X \subseteq U$ we have $|N_G(X)| \ge |X|$. We don't use this in full
generality; rather we use a consequence. Namely, if $d_{\min}(G) \ge n/2$,
then $G$ has a perfect matching. This follows from the FKH theorem as
follows: any set $X$ violating the condition has size greater than $n/2$,
since even a single vertex of $X$ has at least $n/2$ neighbors. But then any
vertex in $W$ has some edge into $X$, since $U - X$ is not big enough to
contain all the edges out of any vertex in $W$. Thus all vertices in $W$ are
neighbors of $X$, so $|N_G(X)| = n \ge |X|$ after all.

(Just for fun, note that we can also use the previous lemma to show
this. Namely if $d_{\min}(G) \ge n/2$, then $\overline{G}$ (with
$d_{\max}(\overline{G}) \le n/2$) and a perfect matching (with
$d_{\max} = 1$) pack.)

Now $H$ is a random graph, but not in the usual sense. We will argue
that with non-zero probability the minimum degree of $H$ is at least $n/2$,
so that it has a perfect matching, and $g$ exists.
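The reduction to matching can be tried out on small instances. The sketch below is an illustration under stated assumptions, not the argument of the proof: both graphs are lists of `(u, w)` pairs on parts labeled `0..n-1`, the matching is found with a standard augmenting-path search, and all function names are hypothetical.

```python
import random


def u_neighbors(edges, n):
    """nbrs[u] = set of W-vertices adjacent to U-vertex u."""
    nbrs = [set() for _ in range(n)]
    for u, w in edges:
        nbrs[u].add(w)
    return nbrs


def compatibility_graph(edges1, edges2, n, f):
    """adj[u] lists the u' with N_G2(u') disjoint from f(N_G1(u))."""
    n1 = u_neighbors(edges1, n)
    n2 = u_neighbors(edges2, n)
    adj = []
    for u in range(n):
        image = {f[w] for w in n1[u]}
        adj.append([v for v in range(n) if not (n2[v] & image)])
    return adj


def perfect_matching(adj, n):
    """Augmenting-path matching; returns g with g[u] = u', or None."""
    owner = [-1] * n  # owner[v] = u currently matched to v

    def augment(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if owner[v] == -1 or augment(owner[v], seen):
                    owner[v] = u
                    return True
        return False

    for u in range(n):
        if not augment(u, set()):
            return None
    g = [0] * n
    for v, u in enumerate(owner):
        g[u] = v
    return g


def random_packing(edges1, edges2, n, tries=100, seed=0):
    """Try random relabelings f of W until some g completes a packing."""
    rng = random.Random(seed)
    for _ in range(tries):
        f = list(range(n))
        rng.shuffle(f)
        g = perfect_matching(compatibility_graph(edges1, edges2, n, f), n)
        if g is not None:
            return f, g
    return None


m = [(i, i) for i in range(6)]  # a perfect matching on parts of size 6
result = random_packing(m, m, 6)
```

For two perfect matchings every vertex $u$ has $n - 1$ admissible images, so a matching in $H$ exists for every choice of $f$. The retry loop and the fixed seed are design choices of this sketch; the lemma itself only asserts that a suitable $f$ exists with non-zero probability.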

To bound the minimum degree of $H$ from below, we ask how many edges
from a vertex $u$ can be excluded. An edge $(u, u')$ is excluded if there is
a $(u, w) \in E_1$ with $(u', f(w)) \in E_2$. The idea is that the number of
such $(u, w)$ is bounded by $d^U_{\max}(G_1)$, while for a given $w$ the
number of such $(u', f(w))$ is around $d(G_2)$, so that for a given $u$, the
number of excluded $u'$ (which we need to be at most $n/2$) is at most around
the product of these two. (By reversing the roles of $G_1$ and $G_2$, we can
bound the number of edges $(u, u')$ into a given $u'$ which are excluded,
thus ensuring $d_H(u')$ is also at least $n/2$ for vertices in the second
part of $H$.)

By the definition of $F$,

$$\#\{u' : (u, u') \notin F\} \le |N_{G_2}(f(N_{G_1}(u)))| \le \sum_{w \in f(N_{G_1}(u))} d_{G_2}(w). \qquad (12)$$

Thus the probability that $d_H(u) < n/2$ (for $u$ in the first part of the
bipartite graph) is bounded by the probability that (12) is greater than
$n/2$. (The case for the second part is similar, and we omit it.) The point
is that $f(N_{G_1}(u))$ is essentially a random subset of $W_2$ of size at
most $d^U_{\max}(G_1)$, so the sum of the degrees of these vertices in $G_2$
should be bounded by $O(d(G_2)\, d^U_{\max}(G_1))$ with high probability. In
the full proof one shows that for a given $u$ the probability of
$d_H(u) < n/2$ is small enough that the probability of all $u$ having degree
at least $n/2$, and thus of a matching, and the consequent $g$, existing, is
at least 1/2.

Here we show exactly what computations we are leaving out: If we define

$$u_i = \frac{d_{G_2}(w_i)}{n\, d(G_2)},$$

where $w_1, \ldots, w_n$ are the vertices of $W_2$, then $\sum_i u_i = 1$,
$u_i \ge 0$, and we want to bound the probability that

$$\sum_{i \in S} u_i > \frac{1}{2\delta},$$

where $\delta = d(G_2)$ and $S$ is a set chosen uniformly at random from sets
of some size at most $d^U_{\max}(G_1)$, which is bounded by
$\frac{n}{100\delta}$ by hypothesis.

The average value of the $u_i$ is $1/n$, so the expected value of the sum
is at most $\frac{1}{100\delta}$. Thus unless the $u_i$ are highly
concentrated, which the technical condition
$d^U_{\max}(G_1) \le \frac{n}{1000 \log n}$ prevents, the sum will stay below
the threshold. We omit the details of the computation. □
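To see why concentration of the $u_i$ matters, the sum can be computed exactly for small cases. This is a sketch under the assumption that $G_2$ is given as a list of `(u, w)` pairs and that all subsets of a fixed size can be enumerated; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def normalized_degrees(edges2, n):
    """u_i = d_G2(w_i) / (n * d(G2)); since n * d(G2) = |E2| these sum to 1."""
    deg_w = [0] * n
    for _u, w in edges2:
        deg_w[w] += 1
    return [Fraction(d, len(edges2)) for d in deg_w]


def subset_sum_stats(weights, size, threshold):
    """Exact mean of sum_{i in S} u_i and exact P[sum > threshold] over all
    subsets S of the given size (feasible only for small n)."""
    subsets = list(combinations(range(len(weights)), size))
    sums = [sum((weights[i] for i in s), Fraction(0)) for s in subsets]
    mean = sum(sums, Fraction(0)) / len(sums)
    tail = Fraction(sum(1 for x in sums if x > threshold), len(sums))
    return mean, tail


n = 6
skewed = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 1), (5, 2)]  # one heavy W-vertex
regular = [(i, i) for i in range(n)]                        # all degrees equal
threshold = Fraction(1, 2)  # 1 / (2 * delta) with delta = d(G2) = 1

skewed_stats = subset_sum_stats(normalized_degrees(skewed, n), 2, threshold)
regular_stats = subset_sum_stats(normalized_degrees(regular, n), 2, threshold)
```

Both graphs give the same expected sum $|S|/n = 1/3$, but only the skewed one exceeds the threshold with positive probability, which is the situation a bound on the maximum degree rules out.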

With this improved packing lemma, and the conditions (as before)

$$\frac{D_R}{n} \ge d(G_1), \qquad (13)$$

$$\frac{D_R}{n} \ge c\, \frac{d^U_{\max}(G_1)}{d(G_1)} \qquad (14)$$

(and the corresponding conditions with $W$ and $G_2$ possibly replacing $U$
and $G_1$, respectively), we can show an improved bound. (Recall that the
first condition is essentially the trivial lower bound on $D_R$, while the
second is Yao's lemma.)

Specifically, if the technical condition
$d^U_{\max}(G_1) \le \frac{n}{1000 \log n}$ (or any of the equivalent
technical conditions) of the improved packing lemma is violated, then by
conditions (13) and (14)

$$\left(\frac{D_R}{n}\right)^2 \ge c\, d^U_{\max}(G_1) > \frac{cn}{1000 \log n},$$

so $D_R = \Omega(n^{3/2}/\sqrt{\log n})$.

Otherwise (as $G_1$ and $G_2$ don't pack) one of the other conditions is
violated. We assume without loss of generality that it is (9), so that
$d^U_{\max}(G_1)\, d(G_2) > \frac{n}{100}$. Together with the above two
conditions, this gives

$$\left(\frac{D_R}{n}\right)^3 \ge c\, d(G_2)\, d^U_{\max}(G_1) > \frac{cn}{100}.$$

Thus $D_R = \Omega(n^{4/3})$.

It seems possible that this bound could be pushed a bit higher, perhaps
to $n^{3/2}$. Currently this is the best lower bound known, and the best
upper bounds known are $\Omega(n^2)$.
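The two cases can be compared numerically. The sketch below simply solves each displayed inequality for $D_R$; the constant `c` is a placeholder (Yao's lemma only guarantees some positive constant), the logarithm is assumed to be natural, and the function names are hypothetical.

```python
import math


def bound_if_technical_condition_fails(n, c=1.0):
    """(D_R/n)^2 > c*n/(1000 log n)  =>  D_R > n * sqrt(c*n/(1000 log n))."""
    return n * math.sqrt(c * n / (1000 * math.log(n)))


def bound_if_condition_9_fails(n, c=1.0):
    """(D_R/n)^3 > c*n/100  =>  D_R > n * (c*n/100)^(1/3)."""
    return n * (c * n / 100) ** (1 / 3)


def guaranteed_bound(n, c=1.0):
    """Whichever case occurs, D_R exceeds the smaller of the two bounds."""
    return min(bound_if_technical_condition_fails(n, c),
               bound_if_condition_9_fails(n, c))
```

For large $n$ the $n^{4/3}$ case is the smaller one and therefore determines the overall bound; for moderate $n$ the constants 100 and 1000 can reverse the order, which does not affect the asymptotic statement.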
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8 Randomized Complexity of Tree Functions — 
Lower Bounds 

For any non-trivial monotone graph property $\mathcal{P}_f$ on graphs with
$n$ nodes we have seen that $D(f) = \Omega(n^2)$ and
$D_R(f) = \Omega(n^{4/3})$.

In this lecture we discuss tree functions: functions with formulas in
which each variable occurs exactly once. We already know that for any tree
function $f$, $D(f) = n$, and we previously saw a tree function $f_0$
(represented by a complete binary tree with alternating and-gates and
or-gates) with $D_R(f_0) = O(n^{0.75\ldots})$.

We show a lower bound on $D_R(f)$ for tree functions, which we use to
deduce that

• $D_R(f_0) = \Omega(n^{0.75\ldots})$, and

• $D_R(f) = \Omega(n^{0.51})$ for any tree function $f$.

8.1 Generalized Costs 

The most natural thing to consider for proving a lower bound on the
complexity of a tree function $f(x, y) = g(x) \circ h(y)$ (with
$x \in \{0,1\}^{n-k}$, $y \in \{0,1\}^k$, and $\circ \in \{\wedge, \vee\}$)
is a top-down induction. Unfortunately we don't know how to get this to work.

Instead, Saks and Wigderson have looked at a bottom-up induction, in
which a gate with two immediate inputs is replaced by a single input. First
$f$ is expressed as $f(x, y, w) = f'(x \circ y, w)$ (with $x \in \{0,1\}$,
$y \in \{0,1\}$, $w \in \{0,1\}^{n-2}$, and $\circ \in \{\wedge, \vee\}$),
and then a lower bound on $f$ is given by a corresponding lower bound on
$f'(v, w)$.

For this technique to work, we need to keep track of the fact that
discovering $v$, which represents $x \circ y$, is somehow more expensive than
just querying a bit. To do this, we generalize our notion of cost. We
associate two costs $c_0(x_i)$ and $c_1(x_i)$ with each variable $x_i$ which
represents a bit of the input to $f$. The cost $c_0$ represents the cost to
discover that $x_i$ is 0, while the cost $c_1$ represents the cost to
discover that $x_i$ is 1. With such a cost function $c$, we define

$$D_R(f, c) = \min_p \max_x \sum_T p_T\, \delta(T, x, c),$$

where $p$ ranges over probability distributions on deterministic decision
trees $T$ for $f$, and
$\delta(T, x, c) = \sum_{i \in S,\, x_i = 0} c_0(x_i) + \sum_{i \in S,\, x_i = 1} c_1(x_i)$,
with $S$ the set of variables queried by $T$ on input $x$.

So we are given a function $f$ with a set of costs $c$. To show a lower bound
on $D_R(f, c)$ we choose the function $f'$ so $f(x, y, w) = f'(x \circ y, w)$,
and we choose a set of costs $c'$ for the inputs of $f'$ such that we can
show $D_R(f, c) \ge D_R(f', c')$, thus inductively generating a lower bound
for $D_R(f, c)$. We will assume that $\circ = \wedge$; the case
$\circ = \vee$ is symmetric.
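The generalized cost is easy to evaluate for explicit trees. The following minimal sketch assumes a deterministic decision tree is encoded as nested tuples, which is an encoding chosen here for illustration; it computes $\delta(T, x, c)$ and the inner maximum over inputs for one fixed mixture $p$, not the minimum over all mixtures.

```python
from fractions import Fraction
from itertools import product

# A deterministic decision tree is either a leaf (an int output) or a
# tuple (i, subtree_if_0, subtree_if_1) that queries variable i.


def query_cost(tree, x, c0, c1):
    """delta(T, x, c): total cost of the variables T queries on input x."""
    cost = 0
    while not isinstance(tree, int):
        i, if_zero, if_one = tree
        cost += c1[i] if x[i] else c0[i]
        tree = if_one if x[i] else if_zero
    return cost


def worst_case_expected_cost(mixture, n, c0, c1):
    """max over inputs x of sum_T p_T * delta(T, x, c) for a fixed mixture."""
    return max(
        sum(p * query_cost(tree, x, c0, c1) for p, tree in mixture)
        for x in product((0, 1), repeat=n)
    )


# The two natural trees for x0 AND x1.
x_first = (0, 0, (1, 0, 1))
y_first = (1, 0, (0, 0, 1))
half = Fraction(1, 2)
mixture = [(half, x_first), (half, y_first)]
unit = [1, 1]
```

With unit costs the worst input for this mixture is $(1, 1)$, where both trees must query both variables; on $(0, 1)$ the two trees pay 1 and 2 respectively, for an expected cost of $3/2$.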

So, leaving the choice of $c'$ unspecified as yet, we have $f$, $c$, $f'$,
and $c'$. We want to show $D_R(f, c) \ge D_R(f', c')$. The min-max theorem
says $D_R(f, c)$ is also equal to

$$\max_q \min_T \sum_x q_x\, \delta(T, x, c),$$

that is, we can also obtain $D_R(f, c)$ by choosing the worst input
distribution, and then the best deterministic algorithm for that
distribution. Thus to show $D_R(f', c')$ is at most $D_R(f, c)$, we will
assume a worst case distribution $q^*$ of inputs to $f'$, and we show that
there exists a $T'$ for $f'$ such that

$$\sum_x q^*_x\, \delta(T', x, c') \le D_R(f, c).$$

So we have the worst case distribution $q^*$ for $f'$, and we want to show
the existence of a $T'$ that does well on $q^*$. We will map $q^*$ to a
distribution $q$ on the inputs to $f$, so that there exists an algorithm
$T^*$ for $f$ such that $\sum_x q_x\, \delta(T^*, x, c) \le D_R(f, c)$. We
know such a $T^*$ exists because $q$ is at worst the worst-case distribution
for $f$, in which case there still exists an algorithm with expected cost
exactly $D_R(f, c)$. We will then construct $T'$ based on $T^*$, so that
their expected costs on their respective distributions can be correlated.

The basic idea will be that $T'(v, w)$ will mimic $T^*(x, y, w)$ for some
choice of $x$ and $y$ such that $x \wedge y = v$. When $T^*$ checks a
variable in $w$, $T'$ will do the same. When $T^*$ checks $x$ or $y$, $T'$
may or may not check $v$.

What should the costs $c'$ be? For variables other than $v$, $c'$ will agree
with $c$. We will wait to determine $c'_0(v)$ and $c'_1(v)$, choosing them as
large as our proof techniques will allow.

What about the distribution $q$? We define $q(1, 1, w) = q^*(1, w)$,
$q(0, 1, w) = p_x\, q^*(0, w)$, and $q(1, 0, w) = p_y\, q^*(0, w)$, where
$p_y$ and $p_x = 1 - p_y$ will be determined later.

What about $T'$? Define $T'_y$ on input $(v, w)$ to mimic $T^*$ on
$(1, v, w)$ and $T'_x$ on input $(v, w)$ to mimic $T^*$ on $(v, 1, w)$.
Define $T'$ on input $(v, w)$ to run $T'_y$ with probability $p_y$ and $T'_x$
with probability $p_x$. That $T'$ is a randomized strategy is no problem;
since $q^*$ is fixed one of the two deterministic strategies $T'_y$ or $T'_x$
will be at least as good.

We need to find the constraints on $c'_1(v)$ and $c'_0(v)$ which will allow
us to show that the expected cost of $T^*$ is at least that of $T'$. For this
it suffices to show that the cost of $T'(v, w)$ for any $v$ and $w$ is at
most $p_y$ times the cost of $T^*(1, v, w)$ plus $p_x$ times the cost of
$T^*(v, 1, w)$.
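The map from $q^*$ to $q$ can be written out directly. The sketch below assumes distributions are stored as dictionaries from inputs to probabilities, with $w$ represented as a tuple; the function names are hypothetical.

```python
from fractions import Fraction


def lift_distribution(q_star, p_x):
    """Map a distribution q* on inputs (v, w) of f' to a distribution q on
    inputs (x, y, w) of f, following the definition in the text."""
    p_y = 1 - p_x
    q = {}
    for (v, w), mass in q_star.items():
        if v == 1:
            q[(1, 1, w)] = mass
        else:
            q[(0, 1, w)] = p_x * mass
            q[(1, 0, w)] = p_y * mass
    return q


def push_forward(q):
    """Distribution of (x AND y, w) when (x, y, w) is drawn from q."""
    out = {}
    for (x, y, w), mass in q.items():
        key = (x & y, w)
        out[key] = out.get(key, 0) + mass
    return out


q_star = {
    (0, (0,)): Fraction(1, 4),
    (1, (0,)): Fraction(1, 4),
    (0, (1,)): Fraction(1, 2),
}
q = lift_distribution(q_star, Fraction(1, 3))
```

The lifted $q$ is again a probability distribution, it never places mass on an input with $x = y = 0$, and applying $x \wedge y$ recovers $q^*$, so every input of $f'$ is represented by inputs of $f$ that reveal as little as possible about $v$.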

8.2 The Saks-Wigderson Lower Bound 

Now that we have the form of our argument, the rest is essentially a matter
of checking cases. We should note that although most of the choices we have
made above are straightforward, there is one choice which in fact anticipates
in a clever way what we will need in the remaining part of the proof. In
particular, we have chosen $T'$ to run $T'_x$ or $T'_y$ with probability
$p_x$ or $p_y$, respectively, and, not coincidentally, we have chosen the
distribution $q$ to map $(0, w)$ to $(0, 1, w)$ or $(1, 0, w)$ with
probability $p_x$ or $p_y$, respectively, as well. This choice bears some
consideration.

Returning to our argument, we want to find the conditions under which
the cost of $T'(v, w)$ is at most $p_y$ times the cost of $T^*(1, v, w)$ plus
$p_x$ times the cost of $T^*(v, 1, w)$. We consider the various cases for
$T^*$, $v$, $x$, and $y$.

If $v = 1$, this reduces to the cost of $T'(1, w)$ being at most the cost of
$T^*(x = 1, y = 1, w)$. If $T^*$ queries neither $x$ nor $y$, then this is
clear, since $T'_x$, $T'_y$, and $T^*$ query exactly the same variables.
Otherwise, if $T^*$ queries only $x$, then $T^*$ pays $c_1(x)$ while $T'$
pays an expected cost of $p_x c'_1(v)$ (the costs to query variables in $w$
are again the same); if $T^*$ queries only $y$, then $T^*$ pays $c_1(y)$
while $T'$ pays an expected cost of $p_y c'_1(v)$; if $T^*$ queries both $x$
and $y$ then $T^*$ pays $c_1(x) + c_1(y)$ while $T'$ pays $c'_1(v)$. Thus
$T^*$ will pay at least what $T'$ pays provided

• $p_x c'_1(v) \le c_1(x)$, and

• $p_y c'_1(v) \le c_1(y)$.

(Adding these two conditions covers the case in which both $x$ and $y$ are
queried.)

The case $v = 0$ is a bit more complicated. In this case, we want the cost
of $T'(0, w)$ to be at most $p_y$ times the cost of $T^*(1, 0, w)$ plus $p_x$
times the cost of $T^*(0, 1, w)$. If neither queries $x$ or $y$ for this $w$
then this is clear. Otherwise, both query $x$ first or both query $y$ first.
We consider the case when $x$ is queried first, the other case being
symmetric.

There are then two cases, depending on whether or not $T^*(1, 0, w)$ queries
$y$ as well as $x$. ($T^*(0, 1, w)$ will not query $y$, since $T^*$ is
optimal and knows the value of $x \wedge y$ after querying $x$.) First assume
that only $x$ is queried by $T^*(1, 0, w)$. Then with probability $p_y$ the
cost to $T'$ is the cost to $T^*(1, 0, w)$ minus $c_1(x)$, and with
probability $p_x$ the cost to $T'$ is the cost to $T^*(0, 1, w)$ minus
$c_0(x)$ plus $c'_0(v)$. Thus we are fine provided

• $-p_y c_1(x) - p_x c_0(x) + p_x c'_0(v) \le 0$.

Next assume that $x$ and $y$ are queried by $T^*(1, 0, w)$. Then with
probability $p_y$ the cost to $T'$ is the cost to $T^*(1, 0, w)$ minus
$c_1(x)$ minus $c_0(y)$ plus $c'_0(v)$, and with probability $p_x$ the cost
to $T'$ is the cost to $T^*(0, 1, w)$ minus $c_0(x)$ plus $c'_0(v)$. Then we
are fine provided

• $p_y(-c_1(x) - c_0(y) + c'_0(v)) + p_x(-c_0(x) + c'_0(v)) \le 0$.

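Collecting all of these inequalities, and the symmetric inequalities for $y$
queried first, we have that $D_R(f, c) \ge D_R(f', c')$ provided that
$c'_0(v)$ and $c'_1(v)$ satisfy the constraints:

$$\begin{aligned}
c'_1(v) &\le c_1(x)/p_x, \\
c'_1(v) &\le c_1(y)/p_y, \\
c'_0(v) &\le p_y c_1(x)/p_x + c_0(x), \\
c'_0(v) &\le p_x c_1(y)/p_y + c_0(y), \\
c'_0(v) &\le p_y (c_1(x) + c_0(y)) + p_x c_0(x), \\
c'_0(v) &\le p_x (c_1(y) + c_0(x)) + p_y c_0(y).
\end{aligned}$$

Choosing $c'_1(v) = c_1(x) + c_1(y)$ forces
$p_x = \frac{c_1(x)}{c_1(x) + c_1(y)}$ and
$p_y = \frac{c_1(y)}{c_1(x) + c_1(y)}$ and yields

Theorem 8.1 (Saks-Wigderson) Let $f$ be a tree function with binary $\wedge$
and $\vee$ gates. Then $D_R(f) \ge \max\{l^0(f), l^1(f)\}$, where

$$\begin{aligned}
l^0(x_i) &= l^1(x_i) = 1, \\
l^1(g \wedge h) &= l^1(g) + l^1(h), \\
l^0(g \wedge h) &= \min\Big\{ l^0(g) + l^1(h),\ l^1(g) + l^0(h),\ \frac{l^0(g)\, l^1(g) + l^0(h)\, l^1(h) + l^1(g)\, l^1(h)}{l^1(g) + l^1(h)} \Big\}, \\
l^0(g \vee h) &= l^0(g) + l^0(h), \\
l^1(g \vee h) &= \min\Big\{ l^1(g) + l^0(h),\ l^0(g) + l^1(h),\ \frac{l^0(g)\, l^1(g) + l^1(h)\, l^0(h) + l^0(g)\, l^0(h)}{l^0(g) + l^0(h)} \Big\}.
\end{aligned}$$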
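The recursion of the theorem can be evaluated mechanically. The sketch below is one possible encoding, assumed here for illustration: a formula is either the string `'x'` (a variable) or a tuple `(op, left, right)`, and `merge_and` takes the minimum over the four upper bounds on $c'_0(v)$ listed above with the stated choice of $p_x$ and $p_y$.

```python
from fractions import Fraction


def merge_and(c0x, c1x, c0y, c1y):
    """Costs (c0', c1') for v = x AND y with p_x = c1x / (c1x + c1y)."""
    c1v = c1x + c1y
    p_x = Fraction(c1x, c1v)
    p_y = Fraction(c1y, c1v)
    c0v = min(
        p_y * c1x / p_x + c0x,
        p_x * c1y / p_y + c0y,
        p_y * (c1x + c0y) + p_x * c0x,
        p_x * (c1y + c0x) + p_y * c0y,
    )
    return c0v, c1v


def lower_bound(formula):
    """Return (l0, l1) for a formula given as 'x' (a variable) or
    (op, left, right) with op in {'and', 'or'}."""
    if formula == 'x':
        return Fraction(1), Fraction(1)
    op, left, right = formula
    l0g, l1g = lower_bound(left)
    l0h, l1h = lower_bound(right)
    if op == 'and':
        return merge_and(l0g, l1g, l0h, l1h)
    # OR is the dual of AND: swap the roles of the 0-costs and 1-costs.
    l1, l0 = merge_and(l1g, l0g, l1h, l0h)
    return l0, l1


and2 = ('and', 'x', 'x')
or_of_ands = ('or', and2, and2)
```

For a single and-gate this gives $(l^0, l^1) = (3/2, 2)$, and for an or of two and-gates it gives $(3, 11/4)$, so the theorem yields a lower bound of 3 for that four-variable function.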



Applying this to the function $f_0$ (the alternating and/or function
mentioned at the beginning of the lecture) shows that the upper bound for
that function is in fact tight.
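The growth rate for the alternating tree can be estimated by iterating the recursion of theorem 8.1 on two identical subtrees. This is a numerical sketch only, assuming and-gates at the bottom level and floating-point arithmetic; the function names are hypothetical.

```python
import math


def alternating_tree_bounds(levels):
    """(l0, l1) for a complete binary tree with `levels` gate levels,
    AND at the bottom and alternating upward; both subtrees are identical."""
    l0, l1 = 1.0, 1.0
    for level in range(levels):
        if level % 2 == 0:   # AND gate over two identical subtrees
            l0, l1 = l0 + l1 / 2, 2 * l1
        else:                # OR gate: the dual update
            l0, l1 = 2 * l0, l1 + l0 / 2
    return l0, l1


def growth_exponent(levels):
    """Estimate e such that max(l0, l1) grows like n^e, comparing trees two
    gate levels apart (which multiplies the number of leaves n by 4)."""
    small = max(alternating_tree_bounds(levels))
    large = max(alternating_tree_bounds(levels + 2))
    return math.log(large / small) / math.log(4)
```

For identical subtrees the third term of each minimum is the smallest, which gives the two update rules used in the loop. The estimated exponent converges to $\log_2\frac{1 + \sqrt{33}}{4}$, about 0.754, consistent with the $n^{0.75\ldots}$ bounds quoted for $f_0$.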

Rafi Heiman and Avi Wigderson generalize this theorem to tree functions
with arbitrary fan-in gates to show a lower bound of $\Omega(n^{0.51})$ for
the randomized decision tree complexity of an arbitrary tree function. See
Randomized vs. Deterministic Decision Tree Complexity for Read-Once
Boolean Functions by Rafi Heiman and Avi Wigderson.

(Lecture by Rafi Heiman.)
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